
Full text of "Introduction to Theoretical Physics"





LEE A. DuBRIDGE, Consulting Editor 


The quality of the materials used in the manufacture of this book is governed by continued postwar shortages.


Bacher and Goudsmit — Atomic Energy States

Bitter — Introduction to Ferromagnetism

Clark — Applied X-rays 

Condon and Morse — Quantum Mechanics 

Curtis— Electrical Measurements 

Davey — Crystal Structure and Its Applications 

Edwards — Analytic and Vector Mechanics 

Eldridge — The Physical Basis of Things

Hardy and Perrin — The Principles of Optics 

Harnwell — Principles of Electricity and Electro- 

Harnwell and Livingood — Experimental Atomic Physics 

Houston — Principles of Mathematical Physics

Hughes and DuBridge — Photoelectric Phenomena 

Hund — High-frequency Measurements

Hund — Phenomena in High-frequency Systems 

Kemble — The Fundamental Principles of Quantum 


Kennard — Kinetic Theory of Gases 

Koller — The Physics of Electron Tubes

Morse — Vibration and Sound 

Muskat — The Flow of Homogeneous Fluids through
Porous Media 

Pauling and Goudsmit — The Structure of Line Spectra 

Richtmyer and Kennard — Introduction to Modern 

Ruark and Urey — Atoms, Molecules and Quanta

Seitz — The Modern Theory of Solids 

Slater — Introduction to Chemical Physics 

Slater — Microwave Transmission 

Slater and Frank — Introduction to Theoretical 


Smythe — Static and Dynamic Electricity 
Stratton — Electromagnetic Theory 
White— Introduction to Atomic Spectra 

Williams — Magnetic Phenomena 

Dr. F. K. Richtmyer was consulting editor of the series from its
inception in 1929 until his death in 1939.





Professor of Physics, Massachusetts Institute of Technology 


Assistant Professor of Physics, Massachusetts 
Institute of Technology 

First Edition 
Tenth Impression 




McGraw-Hill Book Company, Inc.


All rights reserved. This book, or 

parts thereof, may not be reproduced 

in any form without permission of 

the publishers. 



The general plan of a book is often clearer if one knows how it 
came to be written. This book started from two separate 
sources. First, it originated in a year's lecture course of the 
same title, covering about the first two-thirds of the ground pre- 
sented here, the part on classical physics. This course grew out 
of the conviction that the teaching of theoretical physics in a 
number of separate courses, as in mechanics, electromagnetic 
theory, potential theory, thermodynamics, tends to keep a 
student from seeing the unity of physics, and from appreciating 
the importance of applying principles developed for one branch 
of science to the problems of another. The second source of this 
book was a projected volume on the structure of matter, dealing 
principally with applications of modern atomic theory to the 
structure of atoms, molecules, and solids, and to chemical 
problems. As work progressed on this, it became evident that 
the structure of matter could not be treated without a thorough 
understanding of the principles of wave mechanics, and that 
such an understanding demanded a careful grounding in classical 
physics, in mechanics, wave motion, the theory of vibrating 
systems, potential theory, statistical mechanics, where many 
principles needed in the quantum theory are best introduced. 
The ideal solution seemed to be to combine the two projects, 
including the classical and the more modern parts of theoretical 
physics in a coherent whole, thus further increasing the unity of 
treatment of which we have spoken. 

Two general principles have determined the order of presenting 
the material: mathematical difficulty, and order of historical 
development. Mechanics and problems of oscillations, involving 
ordinary differential equations and simple vector analysis, 
come first. Then follow vibrations and wave motion, intro- 
ducing partial differential equations which can be solved by 
separation of variables, and Fourier series. Hydrodynamics, 
electromagnetic theory, and optics bring in more general partial 
differential equations, potential theory, and differential vector 
operations. Wave mechanics uses almost all the mathematical 
machinery which has been developed in the earlier part of the 
book. It is natural that the historical order is in general the
same as the order of increasing mathematical difficulty, for each
branch of physics as it develops builds on the foundation of 
everything that has gone before. In cases where the two 
arrangements do not coincide, we have grouped together subjects 
of mathematical similarity, thus emphasizing the unity of which 
we have spoken. 

In a book of such wide scope, it is inevitable that many impor- 
tant subjects are treated in a cursory manner. An effort has 
been made to present enough of the groundwork of each subject so 
that not only is further work facilitated, but also the position of 
these subjects in a more general scheme of physical thought is 
clearly shown. In spite of this, however, the student will of 
course make much use of other references, and we give a list of 
references, by no means exhaustive, but suggesting a few titles 
in each field which a student who has mastered the material of 
this book should be able to appreciate. 

At the end of each chapter is a set of problems. The ability 
to work problems, in our opinion, is essential to a proper under- 
standing of physics, and it is hoped that these problems will 
provide useful practice. At the same time, in many cases, the 
problems have been used to extend and amplify the discussion of 
the subject matter, where limitations of space made such dis- 
cussion impossible in the text. The attempt has been made, 
though we are conscious of having fallen far short of succeeding 
in it, to carry each branch of the subject far enough so that 
definite calculations can be made with it. Thus a far surer 
mastery is attained than in a merely descriptive discussion. 

Finally, we wish to remind the reader that the book is very 
definitely one on theoretical physics. Though at times descrip- 
tive material, and descriptions of experimental results, are 
included, it is in general assumed that the reader has a fair 
knowledge of experimental physics, of the grade generally covered 
in intermediate college courses. No doubt it is unfortunate, in
view of the unity which we have stressed, to separate the theoretical
side of the subject from the experimental in this way. This
is particularly true when one remembers that the greatest difficulty
which the student has in mastering theoretical physics
comes in learning how to apply mathematics to a physical situation,
how to formulate a problem mathematically, rather than in
solving the problem when it is once formulated. We have tried
wherever possible, in problems and text, to bridge the gap
between pure mathematics and experimental physics. But the
only satisfactory answer to this difficulty is a broad training in 
which theoretical physics goes side by side with experimental 
physics and practical laboratory work. The same ability to 
overcome obstacles, the same ingenuity in devising one method 
of procedure when another fails, the same physical intuition 
leading one to perceive the answer to a problem through a mass 
of intervening detail, the same critical judgment leading one to 
distinguish right from wrong procedures, and to appraise results 
carefully on the ground of physical plausibility, are required in 
theoretical and in experimental physics. Leaks in vacuum sys- 
tems or in electric circuits have their counterparts in the many 
disastrous things that can happen to equations. And it is often 
as hard to devise a mathematical system to deal with a difficult 
problem, without unjustifiable approximations and impossible 
complications, as it is to design apparatus for measuring a diffi- 
cult quantity or detecting a new effect. These things cannot be 
taught. They come only from that combination of inherent 
insight and faithful practice which is necessary to the successful
physicist. But half the battle is over if the student approaches 
theoretical physics, not as a set of mysterious formulas, or as a 
dull routine to be learned, but as a collection of methods, of tools, 
of apparatus, subject to the same sort of rules as other physical 
apparatus, and yielding physical results of great importance. 
The title of this book might have been aptly extended to "Intro- 
duction to the Methods of Theoretical Physics," for the aim has 
constantly been, not to teach a great collection of facts, but to 
teach mastery of the tools by which the facts have been dis- 
covered and by which future discoveries will be made. 

In a subject about which so much has been written, it seems 
hardly practicable to acknowledge our indebtedness to any 
specific books. From many of those mentioned in the section 
on suggested references, and from many others, we have received 
ideas, though the material in general has been written without 
conscious following of earlier models. We wish to express 
thanks to several of our colleagues for suggestions, and partic- 
ularly to Professors P. M. Morse and J. A. Stratton, who have 
read the manuscript with much care and have contributed greatly 
by their discussions. 

Cambridge, Mass., J. C. S.

September, 1933. N. H. F. 



Preface v

Chapter I 

Introduction 1 

1. Power Series 2 

2. Small Quantities of Various Orders 3 

3. Taylor's Expansion 4 

4. The Binomial Theorem 4 

5. Expansion about an Arbitrary Point 4

6. Expansion about a Pole 5 

7. Convergence 5 

Problems 8 

Chapter II 

Introduction 10

8. The Falling Body 11

9. Falling Body with Viscosity 11 

10. Particular and General Solutions for Falling Body with Viscosity 14

11. Electric Circuit Containing Resistance and Inductance 16 
Problems 17 

Chapter III 


Introduction. 19 

12. Particle with Linear Restoring Force 19 

13. Oscillating Electric Circuit 20 

14. The Exponential Method of Solution 21 

15. Complex Exponentials 22 

16. Complex Numbers 23 

17. Application of Complex Numbers to Vibration Problems 25 
Problems 26 

Chapter IV 



Introduction 27 

18. Damped Vibrational Motion 27 



19. Damped Electrical Oscillations 28 

20. Initial Conditions for Transients. 29 

21. Forced Vibrations and Resonance 29

22. Mechanical Resonance 30 

23. Electrical Resonance 31 

24. Superposition of Transient and Forced Motion 33 

25. Motion under General External Forces 35 

26. Generalizations Regarding Linear Differential Equations 36 
Problems 37 

Chapter V 

Introduction 39 

27. Mechanical Energy 40 

28. Use of the Potential for Discussing the Motion of a System 42 

29. The Rolling-ball Analogy 45 

30. Motion in Several Dimensions. 46 

Problems 46 

Chapter VI 

Introduction. 48 

31. Vectors and Their Components 48 

32. Scalar Product of Two Vectors 49 

33. Vector Product of Two Vectors 50

34. Vector Fields 51 

35. The Energy Theorem in Three Dimensions 52 

36. Line Integrals and Potential Energy 52 

37. Force as Gradient of Potential 53 

38. Equipotential Surfaces 54 

39. The Curl and the Condition for a Conservative System 55 

40. The Symbolic Vector V 55 

Problems

Chapter VII 

Introduction 58 

41. Lagrange's Equations 58 

42. Planetary Motion 60

43. Energy Method for Radial Motion in Central Field ... 61 

44. Orbits in Central Motion 62

45. Justification of Lagrange's Method 64 

Problems

Chapter VIII 

46. Generalized Forces. 




47. Generalized Momenta 70 

48. Hamilton's Equations of Motion 71 

49. General Proof of Hamilton's Equations 72 

50. Example of Hamilton's Equations 74 

51. Applications of Lagrange's and Hamilton's Equations ... 75 
Problems. . . 76 

Chapter IX 

Introduction 79 

52. The Phase Space 80 

53. Phase Space for the Linear Oscillator 81 

54. Phase Space for Central Motion 82 

55. Noncentral Two-dimensional Motion 83

56. Configuration Space and Momentum Space 83

57. The Two-dimensional Oscillator 84 

58. Methods of Solution 86

59. Contact Transformations and Angle Variables 87 

60. Methods of Solution for Nonperiodic Motions 90 

Problems 90 

Chapter X 

Introduction 92 

61. Elementary Theory of Precessing Top 92 

62. Angular Momentum, Moment of Inertia, and Kinetic Energy 94 

63. The Ellipsoid of Inertia; Principal Axes of Inertia .... 95 

64. The Equations of Motion 96 

65. Euler's Equations 98 

66. Torque-free Motion of a Symmetric Rigid Body 98 

67. Euler's Angles 100

68. General Motion of a Symmetrical Top under Gravity . . . 102 

69. Precession and Nutation 104 

Problems 105 

Chapter XI 

Introduction 107 

70. Coupled Oscillators 107 

71. Normal Coordinates 111

72. Relation of Problem of Coupled Systems to Two-dimensional Oscillator 114

73. The General Problem of the Motion of Several Particles 117 
Problems 118 

Chapter XII 

Introduction 120 

74. Differential Equation of the Vibrating String 120 



75. The Initial Conditions for the String 122 

76. Fourier Series 123 

77. Coefficients of Fourier Series 124 

78. Convergence of Fourier Series 125 

79. Sine and Cosine Series, with Application to the String 126 

80. The String as a Limiting Problem of Vibration of Particles 128 

81. Lagrange's Equations for the Weighted String 131 

82. Continuous String as Limiting Case 131 

Problems 132 

Chapter XIII 

Introduction 134 

83. Normal Coordinates 134 

84. Normal Coordinates and Function Space 137 

85. Fourier Analysis in Function Space 139 

86. Equations of Motion in Normal Coordinates 140 

87. The Vibrating String with Friction 142 

Problems 144 

Chapter XIV 

Introduction 146 

88. Differential Equation for the Variable String 146 

89. Approximate Solution for Slowly Changing Density and 

Tension 147 

90. Progressive Waves and Standing Waves 149 

91. Orthogonality of Normal Functions 151 

92. Expansion of an Arbitrary Function Using Normal Functions 152

93. Perturbation Theory 154 

94. Reflection of Waves from a Discontinuity 156 

Problems 158 

Chapter XV 

Introduction 160 

95. Boundary Conditions on the Rectangular Membrane . . . 160 

96. The Nodes in a Vibrating Membrane 162 

97. Initial Conditions 162 

98. The Method of Separation of Variables 163 

99. The Circular Membrane 164 

100. The Laplacian in Polar Coordinates. 164 

101. Solution of the Differential Equation by Separation. . . 165 

102. Boundary Conditions . 166 

103. Physical Nature of the Solution 167 

104. Initial Condition at t = 0 168



105. Proof of Orthogonality of the J's 169 

Problems. . 170 

Chapter XVI 

Introduction 172 

106. Stresses, Body and Surface Forces 172 

107. Examples of Stresses. 174 

108. The Equation of Motion 175 

109. Transverse Waves 176

110. Longitudinal Waves 178 

111. General Wave Propagation 179 

112. Strains and Hooke's Law 180 

113. Young's Modulus 182 

Problems 183 

Chapter XVII 

Introduction 185 

114. Velocity, Flux Density, and Lines of Flow 185 

115. The Equation of Continuity 186 

116. Gauss's Theorem 187 

117. Lines of Flow to Measure Rate of Flow 188 

118. Irrotational Flow and the Velocity Potential 188 

119. Euler's Equations of Motion for Ideal Fluids 190 

120. Irrotational Flow and Bernoulli's Equation 191 

121. Viscous Fluids 192 

122. Poiseuille's Law 194 

Problems 195 

Chapter XVIII 

Introduction 197 

123. Differential Equation of Heat Flow 197 

124. The Steady Flow of Heat 198 

125. Flow Vectors in Generalized Coordinates 199 

126. Gradient in Generalized Coordinates. 200 

127. Divergence in Generalized Coordinates 200 

128. Laplacian 201 

129. Steady Flow of Heat in a Sphere 201 

130. Spherical Harmonics 202 

131. Fourier's Method for the Transient Flow of Heat .... 203 

132. Integral Method for Heat Flow 205 

Problems 209 

Chapter XIX 

Introduction 210

133. The Divergence of the Field . . 210 



134. The Potential 211

135. Electrostatic Problems without Conductors 212 

136. Electrostatic Problems with Conductors 215

137. Green's Theorem 217

138. Proof of Solution of Poisson's Equation 217

139. Solution of Poisson's Equation in a Finite Region 220 

140. Green's Distribution. 221 

141. Green's Method of Solving Differential Equations. . . . 222 
Problems 223 

Chapter XX 



Introduction 225

142. The Magnetic Field of Currents 226 

143. Field of a Straight Wire 228 

144. Stokes's Theorem 229 

145. The Curl in Curvilinear Coordinates 229 

146. Applications of Stokes's Theorem 230 

147. Example: Magnetic Field in a Solenoid 231 

148. The Vector Potential 231 

149. The Biot-Savart Law 233 

Problems 234 

Chapter XXI 


Introduction 235 

150. The Differential Equation for Electromagnetic Induction 235 

151. The Displacement Current 236 

152. Maxwell's Equations 239 

153. The Vector and Scalar Potentials 241 



Chapter XXII 
Introduction 246 

154. Energy in a Condenser 246 

155. Energy in the Electric Field 247 

156. Energy in a Solenoid 248 

157. Energy Density and Energy Flow 249 

158. Poynting's Theorem 250

159. The Nature of an E.M.F 250 

160. Examples of Poynting's Vector 251 

161. Energy in a Plane Wave . 253 

162. Plane Waves in Metals 255

Problems 256 


Chapter XXIII 



Introduction 258 

163. Boundary Conditions at a Surface of Discontinuity 258

164. The Laws of Reflection and Refraction 259 

165. Reflection Coefficient at Normal Incidence 260 

166. Fresnel's Equations 262

167. The Polarizing Angle 264

168. Total Reflection 265 

169. The Optical Behavior of Metals 267 

Problems 268 

Chapter XXIV
Introduction 270 

170. Polarization and Dielectric Constant 271 

171. The Relations of P, E, and D 273 


173. Dispersion in Gases 275 

174. Dispersion of Solids and Liquids 278 

175. Dispersion of Metals 280 

Problems 283

Chapter XXV 

Introduction. 286 

176. Spherical Solutions of the Wave Equation 286 

177. Scalar Potential for Oscillating Dipole 288 

178. Vector Potential. 289 

179. The Fields. 290 

180. The Hertz Vector 291

181. Intensity of Radiation from a Dipole 293 

182. Scattering of Light 293 

183. Polarization of Scattered Light 295 

184. Coherence and Incoherence of Light 295 

185. Coherence and the Spectrum 298 

186. Coherence of Different Sources 299 

Problems 299 

Chapter XXVI 

Introduction 302 

187. The Retarded Potentials 303 

188. Mathematical Formulation of Huygens' Principle 305

189. Application to Optics 307 

190. Integration for a Spherical Surface by Fresnel's Zones 308 



191. The Use of Huygens' Principle 310 

192. Huygens' Principle for Diffraction Problems 310 

193. Qualitative Discussion of Diffraction, Using Fresnel's Zones 311

Problems 314 

Chapter XXVII 
Introduction 315

194. Comparison of Fresnel and Fraunhofer Diffraction. . . .315 

195. Fresnel Diffraction from a Slit 319 

196. Cornu's Spiral 320 

197. Fraunhofer Diffraction from Rectangular Slit 323 

198. The Circular Aperture 324 

199. Resolving Power of a Lens 325 

200. Diffraction from Several Slits; the Diffraction Grating 326 
Problems 328 

Chapter XXVIII 

Introduction 329 

201. The Quantum Hypothesis 330 

202. The Statistical Interpretation of Wave Theory 332 

203. The Uncertainty Principle for Optics 333 

204. Wave Mechanics 335 

205. Frequency and Wave Length in Wave Mechanics 337 

206. Wave Packets and the Uncertainty Principle 337 

207. Fermat's Principle 339 

208. The Motion of Particles and the Principle of Least Action 342 
Problems. 343 

Chapter XXIX 
Introduction. 345 

209. Schrödinger's Equation 345

210. One-dimensional Motion in Wave Mechanics 346 

211. Boundary Conditions in One-dimensional Motion 350 

212. The Penetration of Barriers 351 

213. Motion in a Finite Region, and the Quantum Condition . . 353 

214. Motion in Two or More Finite Regions 355 

Problems 356 

Chapter XXX 



Introduction 358 

215. The Quantum Condition in the Phase Space 358

216. Angle Variables and the Correspondence Principle 359



217. The Quantum Condition for Several Degrees of Freedom 361 

218. Classical Statistical Mechanics in the Phase Space .... 364 

219. Liouville's Theorem 365

220. Distributions Independent of Time 366 

221. The Microcanonical Ensemble 367 

222. The Canonical Ensemble 368 

223. The Quantum Theory and the Phase Space 369 

Problems. 371 

Chapter XXXI

Introduction 374

224. Mean Value of a Function of Coordinates 374 

225. Physical Meaning of Matrix Components 375 

226. Initial Conditions, and Determination of c's 377 

227. Mean Values of Functions of Momenta 379 

228. Schrödinger's Equation Including the Time 381

229. Some Theorems Regarding Matrices 382 

Problems ... 384 

Chapter XXXII 

Introduction 386

230. The Secular Equation of Perturbation Theory 386

231. The Power Series Solution 387 

232. Perturbation Theory for Degenerate Systems 390 

233. The Method of Variation of Constants 391 

234. External Radiation Field 392 

235. Einstein's Probability Coefficients 393 

236. Method of Deriving the Probability Coefficients. .... 395 

237. Application of Perturbation Theory 396 

238. Spontaneous Radiation and Coupled Systems 399 

239. Applications of Coupled Systems to Radioactivity and 

Electronic Collisions 402 

Problems 404 

Chapter XXXIII 

Introduction. . . 406 

240. The Atom and Its Nucleus 406 

241. The Structure of Hydrogen 407 

242. Discussion of the Function of r for Hydrogen 410 

243. The Angular Momentum 414 

244. Series and Selection Principles 416

245. The General Central Field 418

Problems 423 


Chapter XXXIV 

Introduction. . 425 

246. The Periodic Table

247. The Method of Self-consistent Fields 430

248. Effective Nuclear Charges 431 

249. The Many-body Problem in Wave Mechanics 432 

250. Schrödinger's Equation and Effective Nuclear Charges 433

251. Ionization Potentials and One-electron Energies 435 

Problems 437 

Chapter XXXV 


Introduction 439 

252. Ionic Forces 439 

253. Polarization Force. . 439 

254. Van der Waals' Force 440 

255. Penetration or Coulomb Force 442 

256. Valence Attraction 442 

257. Atomic Repulsions 444 

258. Analytical Formulas for Valence and Repulsive Forces. . 444 

259. Types of Substances: Valence Compounds 447 

260. Metals 449 

261. Ionic Compounds 449 

Problems 451 

Chapter XXXVI 

Introduction. 454 

262. Gases, Liquids, and Solids 454 

263. The Canonical Ensemble 456 

264. The Free Energy 458 

265. Properties of Perfect Gases on Classical Theory 461 

266. Properties of Imperfect Gases on Classical Theory . . . 462 

267. Van der Waals' Equation 464 

268. Quantum Statistics 466 

269. Quantum Theory of the Perfect Gas 468 

Problems 470 

Chapter XXXVII 

Introduction 471 

270. The Crystal at Absolute Zero 472 

271. Temperature Vibrations of a Crystal 474 

272. Equation of State of Solids 478 

273. Vibrations of Molecules 480 



274. Diatomic Molecules 481 

275. Specific Heat of Diatomic Molecules 483 

276. Polyatomic Molecules 485

Problems 486

Chapter XXXVIII 
Introduction 488 

277. Chemical Reactions 488 

278. Collisions with Electronic Excitation 491 

279. Electronic and Nuclear Energy in Metals 494 

280. Perturbation Method for Interaction of Nuclei 497 

Problems 499 

Chapter XXXIX 

Introduction 501

281. The Exclusion Principle 502 

282. Results of Antisymmetry of Wave Functions ....... 506 

283. The Electron Spin 507 

284. Electron Spins and Multiplicity of Levels 509 

285. Multiplicity and the Exclusion Principle . 510 

286. Spin Degeneracy for Two Electrons 512

287. Effect of Exclusion Principle and Spin 514 

Problems 516 

Chapter XL 
Introduction 518 

288. Atomic Energy Levels 518 

289. Spin and Orbital Degeneracy in Atomic Multiplets .... 520 

290. Energy Levels of Diatomic Molecules 522 

291. Heitler and London Method for H$_2$ 523

292. The Method of Molecular Orbitals 527 

Problems 530 

Chapter XLI 

Introduction 531 

293. The Exclusion Principle for Free Electrons 531 

294. Maximum Kinetic Energy and Density of Electrons 534

295. The Fermi-Thomas Atomic Model 535 

296. Electrons in Metals 536 

297. The Fermi Distribution 540

Problems 543 


Chapter XLII 


Introduction 545

298. Dispersion and Dispersion Electrons 546 

299. Quantum Theory of Dispersion . 548 

300. Polarizability 549

301. Van der Waals' Force 551 

302. Types of Dielectrics 553 

303. Theory of Dipole Orientation 554 

304. Magnetic Substances 556 

Problems 558 

Suggested References. 561 

Index 565 





The first result of a physical experiment is ordinarily a table 
of values, one column containing values of an independent 
variable, another of a dependent variable. In mechanics, the 
independent variable is ordinarily the time, the dependent 
variable the displacement. In thermodynamics, we may have 
two independent variables, as volume and temperature, and one 
dependent variable, the pressure. With electric currents, we 
may have the current flowing in some part of the circuit as 
dependent variable, the electromotive force applied as inde- 
pendent variable, as when in a vacuum tube we measure plate 
current as function of grid voltage. In electromagnetic theory, 
the electric or magnetic field strength, the dependent variable, 
is a function of four independent variables, the three coordinates 
of space, and time. 

The relation between independent and dependent variable can 
be given by a table of values, by drawing a graph, or analytically 
by approximating the results by a mathematical formula. The 
last method is by far the most powerful, particularly if further 
calculations must be made using the experimental results, so that 
we are led to the study of mathematical functions. There are a 
good many well-known functions; for example, the algebraic functions,
as $ax + bx^2$; the trigonometric functions, as $\sin(ax + b)$;
exponential functions, as $ae^{-bx}$; and rarer things like Bessel's
functions, $J_n(x)$. It may be that, by inspection of the results,
or for some theoretical reason, we may decide that some such 
well-known function can be used to describe our experimental 
data within the experimental error. But in actual physical
problems, we meet many functions which are not included among
these well-known forms. The question presents itself, can we
not get some general method of describing functions analytically,
equally applicable to familiar and unfamiliar functions? 

1. Power Series.— Power series present one such general 
method, on the whole the most useful one. The simplest form 
of power series is $A_0 + A_1 x + A_2 x^2 + \cdots$, where the $A$'s
are arbitrary coefficients. By giving these coefficients suitable
values, we can make the series approach any desired function
as closely as we please, with some exceptions as we shall note
below. As examples of common series, we have first the polynomials
(in which all $A_n$'s after a certain $n$ are zero); and then
many familiar infinite series, as

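$$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots, \tag{1}$$

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \tag{2}$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \tag{3}$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots. \tag{4}$$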
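As a rough numerical illustration of series (2) to (4), the following sketch sums a chosen number of terms and compares the result with the library value. The function names and the sample point $x = 0.5$ are our own choices for illustration, not part of the text.

```python
import math


def exp_series(x, terms):
    """Partial sum of series (2): 1 + x + x^2/2! + x^3/3! + ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # next term: x^(k+1)/(k+1)!
    return total


def cos_series(x, terms):
    """Partial sum of series (3): 1 - x^2/2! + x^4/4! - ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


def sin_series(x, terms):
    """Partial sum of series (4): x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


if __name__ == "__main__":
    x = 0.5  # illustrative sample point
    for terms in (1, 2, 4, 8):
        print(
            terms,
            exp_series(x, terms) - math.exp(x),
            cos_series(x, terms) - math.cos(x),
            sin_series(x, terms) - math.sin(x),
        )
```

For a small argument such as this one, the printed differences shrink quickly as more terms are kept, which is the behavior the later discussion of rapidly converging series relies on.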

In fitting an experimental table of values, it is generally true 
that we cannot use one of these well-known series. We must 
determine coefficients to fit the data. A familiar process is that 
in which we know beforehand that the graph of the function 
should be a straight line. Then, either by actually plotting and 
estimating by means of a ruler, or by using least squares, we 
find the two constants of the linear relation y = a + bx. If 
the graph is slightly curved, we may be able to determine the 
constants of a parabola $y = a + bx + cx^2$ to fit it approximately.
More complicated curves can be approximated by taking more 
terms. It is plain that, if there are n points determined experi- 
mentally, we can find a polynomial containing n coefficients which 
will just pass through them. But this is hardly a sportsmanlike 
thing to do, and generally we look for a function containing 
far fewer constants than the number of points we wish to fit.
In other words, in practice, rather than using infinite series, we
are accustomed to use only the first few terms of such a series.
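The two fitting procedures just described can be sketched in a few lines. The first function finds the constants of $y = a + bx$ by least squares; the second evaluates the polynomial with $n$ coefficients that passes exactly through $n$ points (written here in Lagrange's form, which is one convenient choice and not one the text specifies). The function names and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares constants (a, b) of the straight line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def interpolate(xs, ys, x):
    """Value at x of the unique polynomial through all n points (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


if __name__ == "__main__":
    # Made-up "experimental" table, roughly linear with some scatter.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.1, 2.9, 5.2, 7.1, 8.8]
    a, b = fit_line(xs, ys)
    print("line: a =", a, " b =", b)
    # The exact polynomial reproduces every point, scatter included.
    print("polynomial at x = 2.5:", interpolate(xs, ys, 2.5))
    print("line at x = 2.5:", a + b * 2.5)
```

The two-constant line smooths over the scatter, whereas the five-coefficient polynomial follows every experimental error faithfully, which is the reason for preferring far fewer constants than points.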


2. Small Quantities of Various Orders. — The general justifica- 
tion of this method of using only a finite part of a series comes 
from considering small quantities of various orders, as they are 
called. A power series is practically useful only if it converges 
rather rapidly; that is, if each term is decidedly smaller than the 
one before it. If we imagine that a physical relation is really 
expressed by a rapidly converging infinite series, then the sum 
of all the terms after a certain one will be smaller than the 
inevitable errors of experiment, and may be neglected, leaving 
only a polynomial. Suppose, for instance, that the linear dimension
$d$ of a solid under pressure, expressed as a function of the
pressure $p$, is given exactly by a series $d = d_0 - ap + bp^2 - \cdots$.
For small pressures, the change of length $ap$ will be
small compared with $d_0$, and the second-order term $bp^2$ will be
in turn small compared with $ap$ (though of course this will not
be true for much higher pressures, since $ap$ will increase, and
$bp^2$ will increase even more). We express this by saying that
$ap$ is a small quantity of the first order, $bp^2$ a small quantity of
the second order. It may well be that the second-order quantities
are so small that we can neglect them, so that approximately
$d = d_0 - ap$. Now if we are interested in finding the way in
which the volume, proportional to $d^3$, changes with pressure,
we have accurately

$$d^3 = d_0^3 - 3d_0^2\,ap + (3d_0a^2 + 3d_0^2b)p^2 + \cdots. \tag{5}$$

But we are assuming that $ap$ is small compared with $d_0$, and
$bp^2$ is small compared with $ap$, for all pressures used. Thus we
readily see that the term in $p^2$ in this final expression (5) is
small compared with the term in $p$, and can be neglected in
comparison with the leading term $d_0^3$, so that in $d^3$, as in $d$, we
can neglect the second order of small quantities. We could
then have started with the abbreviated expression $d = d_0 - ap$,
and have obtained the same result for $d^3$, to the first order.
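A short numerical check may make the orders of magnitude concrete. The sketch below evaluates $d^3$ exactly and compares it with expression (5) cut off after the first-order and after the second-order term. The constants $d_0$, $a$, $b$ and the pressures are invented purely for illustration, and the function names are ours.

```python
def length(p, d0, a, b):
    """Linear dimension d = d0 - a*p + b*p^2 (series cut off after second order)."""
    return d0 - a * p + b * p * p


def volume_exact(p, d0, a, b):
    """d^3 computed directly from the length."""
    return length(p, d0, a, b) ** 3


def volume_first_order(p, d0, a):
    """Expression (5) keeping only the term in p."""
    return d0**3 - 3 * d0**2 * a * p


def volume_second_order(p, d0, a, b):
    """Expression (5) keeping the terms in p and p^2."""
    return volume_first_order(p, d0, a) + (3 * d0 * a * a + 3 * d0 * d0 * b) * p * p


if __name__ == "__main__":
    d0, a, b = 1.0, 1e-3, 1e-6  # hypothetical constants
    for p in (1.0, 10.0, 100.0):
        exact = volume_exact(p, d0, a, b)
        print(
            p,
            exact - volume_first_order(p, d0, a),
            exact - volume_second_order(p, d0, a, b),
        )
```

With these assumed constants the first-order error grows roughly as $p^2$, so the cutoff that is harmless at low pressure becomes noticeable at the highest pressure in the list, in line with the warning that the justification depends on the physical situation.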

This method of cutting off infinite series at definite places, 
retaining only terms of a certain order, is very commonly used, 
and often is the only thing that simplifies computations with 
series enough to make them practically possible. But we must 
notice that the justification depends entirely on the physical 
situation, and can be different in different cases. Thus if we 
had to consider higher pressures in our problem above, we should 
have to retain the second-order terms, but perhaps could neglect
third-order ones. One must always use good physical judgment
in neglecting small quantities. Now, of course, in many cases 
we do not need to neglect high powers at all. The problems 
which we meet will often have simple enough relations between 
the coefficients of the successive terms so that we can write 
down as many terms as we please, without trouble, as we can 
with the binomial or exponential series. But it always pays
to inquire, if the high terms of the series get too complicated
to work with successfully, whether they cannot be neglected.

3. Taylor's Expansion. — We have been speaking of series 
representing functions obtained from experiment, or about which
we do not have much information. But it may be that we have 
to work with a function whose analytical properties we know, 
and in that case there is a standard method of finding its series 
expansion, known as Taylor's theorem. This is as follows: 

$$f(x) = f(0) + f'(0)\,x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots, \qquad (6)$$

where $f(x)$ is the function of $x$, $f(0)$ means the value of the
function when $x = 0$, $f'(0)$ is the first derivative for $x = 0$, and
so on, so that $f(x) = A_0 + A_1x + A_2x^2 + \cdots$, where
$A_n = f^{(n)}(0)/n!$. To justify this, we need only differentiate $n$
times, obtaining very easily

$$f^{(n)}(x) = n(n-1)\cdots(2)(1)A_n + (n+1)(n)\cdots(2)A_{n+1}x + (n+2)(n+1)\cdots(3)A_{n+2}x^2 + \cdots = n!\,A_n + \frac{(n+1)!}{1!}A_{n+1}x + \cdots.$$

If now we let $x = 0$, all terms but the first vanish, so that we have
$f^{(n)}(0) = n!\,A_n$, or $A_n = f^{(n)}(0)/n!$.

4. The Binomial Theorem. — As an illustration of Taylor's 
expansion, we prove the binomial theorem, the expansion of 
$(1 + x)^n$ given in Eq. (1). We have

$$f(x) = (1+x)^n, \qquad f'(x) = n(1+x)^{n-1}, \qquad f''(x) = n(n-1)(1+x)^{n-2},$$

etc., by differentiation. Thus, setting $x = 0$, $(1 + x)$ goes into
1, so that we have $f(0) = 1$, $f'(0) = n$, $f''(0) = n(n-1)$,
etc., and $A_0 = 1$, $A_1 = n/1!$, $A_2 = n(n-1)/2!$, etc.
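
The coefficients just found can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `binomial_series` and the sample values of the exponent and of $x$ are assumptions made for illustration. Each coefficient is built from the preceding one, since $A_{k+1} = A_k (n-k)/(k+1)$.

```python
def binomial_series(n, x, terms):
    """Partial sum of 1 + n x + n(n-1)/2! x^2 + ... with the given number of terms."""
    coeff = 1.0      # A_0 = 1
    total = 0.0
    for k in range(terms):
        total += coeff * x**k
        coeff *= (n - k) / (k + 1)   # A_{k+1} from A_k
    return total

# Nonintegral exponent: the series does not break off, but converges quickly for small x.
print(binomial_series(0.5, 0.1, 6), (1 + 0.1) ** 0.5)

# Integral exponent: the series breaks off after the term in x^n.
print(binomial_series(3, 2.0, 4), (1 + 2.0) ** 3)
```

For an integral exponent the factor $(n - k)$ vanishes at $k = n$, so all later coefficients are zero and the series terminates as a polynomial.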

5. Expansion about an Arbitrary Point.— A slightly more 
general expansion is obtained by shifting the origin along the x 
axis to a point a. The expansion is 


$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots. \qquad (7)$$

From Taylor's theorem, we can see immediately a general 
condition which a function must satisfy if it can be expanded 
in power series about a given point (by expanding about a point 
we mean setting up an expansion in powers of x — a, if a is 
the given point). The function and all its derivatives must be 
finite at the point in question, since otherwise some coefficients 
of the expansion will be infinite. Thus for example we cannot
expand $1/x$ in power series in $x$: we have $f(0) = 1/0$, which is
infinite, and all the derivatives are also infinite. Such a point is
called a singular point of the function. But by expanding about
another point we can avoid this difficulty. Thus we can expand
$1/x$ about $a$, if $a \neq 0$:

$$f(a) = 1/a, \quad f'(a) = -1/a^2, \quad f''(a) = 1\cdot 2/a^3, \quad f'''(a) = -1\cdot 2\cdot 3/a^4,$$

etc., so that

$$\frac{1}{x} = \frac{1}{a} - \frac{(x-a)}{a^2} + \frac{(x-a)^2}{a^3} - \frac{(x-a)^3}{a^4} + \cdots. \qquad (8)$$

From this we can understand that a function can be expanded in 
power series about a point which is not a singular point. 
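
The expansion of $1/x$ about a point $a \neq 0$ can be tried out directly. In this sketch the function name `inverse_series` and the sample points are illustrative assumptions; the sum is the series of Eq. (8), whose successive terms are in the ratio $-(x-a)/a$.

```python
def inverse_series(x, a, terms):
    """Partial sum of 1/a - (x-a)/a^2 + (x-a)^2/a^3 - ... (Eq. 8)."""
    return sum((-1) ** k * (x - a) ** k / a ** (k + 1) for k in range(terms))

a = 1.0
for x in (1.1, 1.5, 1.9):
    # The closer x is to a, the fewer terms are needed for a given accuracy.
    print(x, inverse_series(x, a, 20), 1 / x)
```

Because this is a geometric series in $-(x-a)/a$, the partial sums settle down only when $x - a$ is numerically smaller than $a$.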

6. Expansion about a Pole. — At some singular points, the 
function behaves like $1/x^n$, an inverse power of $x$. Such a
singularity is called a pole. If $f(x)$ has a pole of order $n$ at the
origin, then by definition $x^nf(x)$ has no singularity at the origin,
and can be expanded in power series $A_0 + A_1x + \cdots$. Thus
we have for $f(x)$ the expansion

$$f(x) = \frac{A_0}{x^n} + \frac{A_1}{x^{n-1}} + \cdots + \frac{A_{n-1}}{x} + A_n + A_{n+1}x + \cdots,$$

an infinite series starting with inverse powers, but turning into
an ordinary series of positive powers after its $n$th term. A similar
theorem holds for expansion about a pole at $x = a$. A singularity
which is not a pole is called an essential singularity. An example
of an essential singularity is that possessed by the function
$e^{-1/x}$ at $x = 0$. This function approaches 0 as $x$ approaches 0
through positive values, but becomes infinite as $x$ approaches 0
through negative values, and no inverse power $1/x^n$ has such a
behavior.

7. Convergence. — A series is said to converge if the process of 
adding its terms is one that can be carried out and that leads to a 


definite answer. Thus $(1 - x)^{-1}$, by the binomial theorem, is
equal to $1 + x + x^2 + x^3 + \cdots$. Now if $x$ is less than unity,
and we try to add these terms, we get an answer. For example,
if $x = 0.1$, we have $1 + 0.1 + 0.01 + 0.001 + 0.0001 + \cdots =
1.111\cdots$, which equals $(1 - 0.1)^{-1} = 10/9$, as it should.
But if $x$ is greater than unity, this no longer holds: if $x = 2$,
we have $1 + 2 + 4 + 8 + \cdots$, which certainly is infinitely
great, and leads to no definite value. Another situation is
obtained if we set $x = -1$ in the series, when we have $1 - 1 +
1 - 1 + 1 \cdots$, a series which is said to oscillate (successive
terms have opposite signs). As a matter of fact, we find that
the series $1 + x + x^2 + x^3 + \cdots$, which is called the geometric
series, converges if $x$ is between $-1$ and $+1$, but does not
converge if $x$ is equal to or greater than 1, or equal to or less
than $-1$. This series illustrates two of the simplest types of
nonconvergence of series, the simple divergence, in which terms 
get greater and greater, and the oscillation, where the terms have 
the same order of magnitude but alternate sign. There is still 
another type of series which does not converge, sometimes called 
the semiconvergent or asymptotic series, whose terms begin to 
decrease regularly as we go out in the series, but after a certain 
point start in increasing, and eventually become infinite. These 
asymptotic series often can be used for computation, for it can 
be shown in many cases that, if we retain terms just up to the 
smallest one, the resulting sum is a good approximation to the 
function the series is supposed to represent. 

Our definition of convergence in the last paragraph was very 
crude. More exactly, a series converges if the sum of the first n 
terms approaches a limit as n increases indefinitely. This defini- 
tion agrees with the usual procedure of the physicist, for he often 
computes by series, and he does it by adding a finite number of 
terms. He carries this far enough so that adding more terms 
does not change the sum, to the order of accuracy to which he 
works, which essentially means that the sum is approaching 
a limit. 
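
The definition in terms of partial sums lends itself to a short numerical check. The sketch below is illustrative only (the helper `partial_sums` is an assumed name); it lists the partial sums of the geometric series for the three values of $x$ used above.

```python
def partial_sums(x, n_terms):
    """Return the list of sums of the first 1, 2, ..., n_terms terms of 1 + x + x^2 + ..."""
    sums, total = [], 0.0
    for k in range(n_terms):
        total += x**k
        sums.append(total)
    return sums

for x in (0.1, 2.0, -1.0):
    print(x, partial_sums(x, 8))
```

For $x = 0.1$ the partial sums approach $10/9$; for $x = 2$ they grow without limit; for $x = -1$ they alternate between 1 and 0 and approach no limit.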

To tell whether a given series converges is not always easy. 
In the first place, we can be sure in some cases that given Taylor's 
expansions cannot converge if the argument (that is, the
independent variable) has too large a value. Thus $1 + x + x^2 + \cdots$
does not converge if $x$ is equal to, or greater than, 1, and we could
have seen this from the fact that the series equals $1/(1 - x)$,
which has a singularity for $x = 1$ (being equal to $1/0$). Thus the
function is infinite for $x = 1$, and the series to represent it could
not converge. And increasing $x$ beyond 1 cannot make the series
converge again. In fact, as soon as the variable in a series
becomes greater than the value for which the function has a
singularity, the series will diverge. But it is a little more
complicated than this would seem, for $1 + x + x^2 + \cdots$ diverges
also for $x$ less than $-1$, and there is no singularity here. As a
matter of fact, a power series converges in general so long as the 
argument is less in absolute value than the smallest value for 
which there is a singularity, but not beyond. But this singu- 
larity can come from imaginary or complex values of the argu- 
ment, so that we might well miss it completely if we did not 
consider imaginary values. For this reason, this criterion for 
convergence is rather tricky. 

When we actually examine a series, we can often tell whether 
it converges or not. Surely a series cannot converge unless its 
successive terms get smaller and smaller. We can investigate 
this by the ratio test, taking the ratio of the nth term to the one 
before, and seeing how this ratio changes as we go out in the series. 
If the limiting ratio is less than 1, the series converges; if it is 
greater than 1, it diverges. If the ratio is just 1, the test gives 
no information. Thus for example with the series $x + x^2/2 +
x^3/3 + \cdots$, the ratio of the term in $x^n$ to that in $x^{n-1}$ is

$$\frac{x^n/n}{x^{n-1}/(n-1)} = \frac{n-1}{n}\,x.$$

As $n$ approaches infinity, $n - 1$ and $n$
become approximately equal, so that the ratio approaches $x$.
Thus we see that if $x$ is less numerically than unity, this series
converges; if $x$ is greater than unity, it diverges; if $x = 1$, we
cannot say. From other information, we know that the series
when $x = 1$, which is $1 + 1/2 + 1/3 + 1/4 + \cdots$, diverges.
But with the similar series $x + x^2/2^2 + x^3/3^2 + \cdots$, where the
ratio of terms also approaches $x$ as we go out in the series, and
the series again diverges for $x$ greater numerically than unity,
converges for $x$ less than unity, we have just the other situation
at $x = 1$: the series $1 + 1/2^2 + 1/3^2 + \cdots$ converges.
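
The behavior of the ratio, and the two different outcomes at $x = 1$, can be seen numerically. This is a sketch under assumed names (`term_ratio`, `partial_sum_at_one`); the argument `power` selects between the series with terms $x^n/n$ and the one with terms $x^n/n^2$.

```python
def term_ratio(x, n, power=1):
    """Ratio of the term x^n / n^power to the preceding term x^(n-1) / (n-1)^power."""
    return x * ((n - 1) / n) ** power

def partial_sum_at_one(terms, power=1):
    """Sum of the first `terms` terms of 1 + 1/2^power + 1/3^power + ... (the series at x = 1)."""
    return sum(1.0 / n**power for n in range(1, terms + 1))

for n in (10, 100, 1000):
    print(n, term_ratio(0.5, n), term_ratio(0.5, n, power=2))   # both approach x = 0.5

for terms in (100, 1000, 10000):
    print(terms, partial_sum_at_one(terms), partial_sum_at_one(terms, power=2))
```

The first column of sums keeps growing as more terms are taken, while the second barely changes, in line with the statement that the ratio test alone cannot decide the case $x = 1$.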

Often a series can be approximately summed by comparison 
with an integral. Thus 

$$1 + \frac{1}{2^n} + \frac{1}{3^n} + \cdots = \sum_z \frac{1}{z^n} = \int \frac{dz}{z^n} \quad \text{approximately}.$$

The approximation is rather poor for the small values of $z$, but
becomes better for large $z$ values, on which the convergence
depends. It would be a better approximation, for instance, to
write

$$\frac{1}{10^n} + \frac{1}{11^n} + \cdots = \int_{10}^{\infty} \frac{dz}{z^n}.$$

From this we see that the series converges when $n > 1$, the
integral being $-\dfrac{1}{(n-1)z^{n-1}}$, which is zero at the upper
limit, but diverges if $n \leq 1$. For $n = 1$, for instance, the
integral becomes logarithmically infinite at $z = \infty$.
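
A quick numerical comparison shows how close the integral comes to the sum. The sketch assumes the illustrative names `tail_sum` and `tail_integral` and cuts the infinite sum off after a large but finite number of terms, so the printed sum is itself only approximate.

```python
def tail_sum(n, start=10, terms=200000):
    """Sum of 1/z^n for z = start, start+1, ..., cut off after `terms` terms."""
    return sum(1.0 / z**n for z in range(start, start + terms))

def tail_integral(n, start=10):
    """Integral of dz / z^n from `start` to infinity; only meaningful for n > 1."""
    return 1.0 / ((n - 1) * start ** (n - 1))

for n in (2, 3):
    print(n, tail_sum(n), tail_integral(n))
```

For $n = 2$ the two numbers agree to within a few per cent, which is the sense in which the text calls the comparison approximate.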


1. Plot 1 as a function of x, and show that it has a minimum at 

x a, 2 

x = 2. Expand in Taylor's series about this point, obtaining an expansion
$y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3 + \cdots$, where necessarily the
coefficient $A_1$ is zero. Now plot on the graph the successive approximations
$y = A_0$, $y = A_0 + A_2(x - 2)^2$, $y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3$,
$y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3 + A_4(x - 2)^4$, observing how they
approximate the real curve more and more accurately.

2. a. Derive the series for the exponential, cosine and sine series, directly 
from Taylor's theorem. 

b. Differentiate the series for sin x term by term, and show that the
result is the series for cos x. 

3. In the series for $e^x$, set $x = 1$, obtaining a series for $e$. Using this series,
compute the value of e to four decimal places. 

4. Why does one always have series for $\ln (1 + x)$ in powers of $x$, rather
than for $\ln x$? From the series for $\ln (1 + x)$, compute logarithms to
base $e$ of 1.1, 1.2, 1.3, 1.4, 1.5.

5. The function $1/(x - i)$, where $i = \sqrt{-1}$, has a singularity for $x = i$,
but not for any real value of $x$. Show that nevertheless the series expansion
about $x = 0$ diverges for $x$ greater than 1 or less than $-1$, obtaining the
power series by Taylor's theorem, and separating real and imaginary parts
of the series. This is an example of a case where the series diverges on
account of singularities for complex values of $x$.

6. As a result of an experiment, we are given the table of values following: 
























Try to devise some practicable scheme for telling whether this function (in 
which, being a result of experiment, the values are only approximations), 
can be represented within the error of experiment by a linear, quadratic, 
cubic, etc., polynomial. Get the coefficients of the resulting series, and use 
them to find the value of the function and its slope at x = 0. Plot the 
points, the curve which approximates them, and the straight-line tangent to 
the curve at x = 0. It is legitimate to use graphical methods if you wish. 
7. Expand $\tan^{-1} x$ in a power series about $x = 0$. Hints:

(a) $\dfrac{d}{dx}\tan^{-1}(x) = \dfrac{1}{1 + x^2}$.

(b) $\dfrac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots$.

(c) $\displaystyle\int \frac{d}{dx}(\tan^{-1} x)\,dx = \tan^{-1} x + c$.

What is the range of convergence of the resulting series? Calculate from
this series the value of $\pi/4 = \tan^{-1} 1$ correct to 5 per cent. How many terms
of the series are necessary to obtain this accuracy?

8. By a procedure analogous to that used in Prob. 7 expand $\sin^{-1} x$ in a
power series about $x = 0$. Find the range of convergence for this series.

9. From the known Taylor's series for $e^x$, write the corresponding series
for $e^{-x^2}$. By integrating this series obtain to 1 per cent a value for

$$\int_0^1 e^{-x^2}\,dx,$$

whose correct value is 0.748. ...

10. Make use of the binomial theorem to obtain an expansion of
$\sqrt{1 + \sqrt{x}}$ in ascending powers of $x^{1/2}$. What is the range of convergence?

11. Discuss by the ratio test the convergence of the following series: 

(a) $x + x^2/2 + x^3/3 + x^4/4 + \cdots$

(b) $x + x^2/2^2 + x^3/3^2 + x^4/4^2 + \cdots$

(c) The binomial expansion of $(1 + x)^k$, for nonintegral $k$.

(d) The series for $e^x$, $\sin x$, $\cos x$.




Most important physical laws involve statements giving the 
relation between the rate of change of some quantity and other 
quantities. Such a relation, stated in mathematical language, 
is a differential equation — an equation containing derivatives of 
functions, as well as the functions themselves. For example, 
the fundamental law of mechanics is Newton's second law of 
motion: the force equals the time rate of change of the 
momentum. Or in electricity, in a circuit containing an 
inductance, the back electromotive force of the inductance equals 
a constant times the time rate of change of the current. But 
these differential relations are not in the form which can be used 
in making direct connection with experiment. One cannot 
directly plot graphs, or give tables of values, from them. One 
must rather solve the differential equations, that is, find algebraic 
relations between the variables, containing no differentiations, 
but consistent with the differential equations. For most of our 
course we shall be interested in finding such solutions of differen- 
tial equations. 

Solving differential equations is rather like integrating func- 
tions: there are no general rules. Individual cases must be 
treated by appropriate special methods. We shall meet some 
such special rules, and shall make much use of some of them. 
Those who have studied differential equations have learned a 
variety of such rules. But rather more important on the whole 
is a method which is applicable, though not always most con- 
venient, in a very large number of cases: the method of power 
series. In general, the solution of a differential equation consists 
of a certain functional relation between variables. If we assume 
that this function is expanded in power series, our only problem 
is to determine the coefficients. And by substituting the series 
back into the differential equation, we can very often get condi- 
tions for determining them. We shall illustrate the method 
by examples. 



8. The Falling Body. — Imagine a body moving vertically 
under the action of gravity. To describe its motion, we have an
independent variable, the time $t$, and a dependent variable, the
height $x$. Let the mass of the body be $m$, and let its velocity,
which is of course $dx/dt$, be also called $v$. The force acting on
it is $F$. Then Newton's law states that $F = d(mv)/dt$, where $mv$
is the momentum. If the mass is constant (which does not
always have to be the case, as we shall see in Prob. 7), we can
rewrite the equation as $F = m\,dv/dt$, or $F = ma$, where $a$ is the
acceleration. Substituting $v = dx/dt$, this is also $F = m\,d^2x/dt^2$.
These are all forms of Newton's second law, written as differential 
equations. We shall first take the case where the force, like 
that of gravity on the earth's surface, is constant: $F = \text{constant} =
-mg$, where $g$ is the acceleration of gravity, and where the
negative sign means that the force is downward. Then we have

$$F = -mg = m\frac{dv}{dt} = m\frac{d^2x}{dt^2}, \qquad (1)$$

or $d^2x/dt^2 = dv/dt = -g$. These can be solved at once, by direct
integration: integrating once with respect to $t$, $dx/dt = v =
\text{constant} - gt = v_0 - gt$, where $v_0$, the constant of integration,
obviously means the value of the velocity when $t = 0$. Integrating
again, and calling the second constant of integration $x_0$, we
have $x = x_0 + v_0t - \tfrac{1}{2}gt^2$, containing now two arbitrary
constants, the initial position and initial velocity. The presence
of such arbitrary constants is the most characteristic feature of 
the solutions of differential equations. And we note that the 
number of arbitrary constants equals the number of integrations 
we must perform to get rid of the differentiations. If the dif- 
ferential equation is one of the first order (with only first deriva- 
tives in it), there will be one arbitrary constant in the solution; 
if it is of the second order (second derivatives), there will be two, 
and so on. And always the arbitrary constants must be deter- 
mined so as to satisfy certain "initial conditions," such as the 
values of the position and velocity at t = 0. 
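
As a small check on the falling-body solution, the sketch below evaluates the position formula $x = x_0 + v_0t - \tfrac{1}{2}gt^2$ and confirms by a second difference that its acceleration is $-g$. The numerical values (c.g.s. units, `g = 980.0`, and the initial conditions) are illustrative assumptions, not data from the text.

```python
g = 980.0  # cm/sec^2, an illustrative value for the acceleration of gravity

def position(t, x0, v0):
    """x = x0 + v0 t - (1/2) g t^2, with x0 and v0 the two arbitrary constants."""
    return x0 + v0 * t - 0.5 * g * t**2

def velocity(t, v0):
    """v = v0 - g t, the result of the first integration."""
    return v0 - g * t

x0, v0 = 100.0, 50.0   # hypothetical initial position (cm) and velocity (cm/sec)
t, h = 0.7, 1.0e-3

# A central second difference approximates d^2x/dt^2.
accel = (position(t + h, x0, v0) - 2 * position(t, x0, v0) + position(t - h, x0, v0)) / h**2
print(accel, -g)
print(position(0.0, x0, v0), velocity(0.0, v0))   # reproduces the initial conditions
```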

9. Falling Body with Viscosity.— With the problem of the 
falling body, the solution has automatically come out as a poly- 
nomial in t, which is simply a power series that breaks off, so 
that there is no need of more complicated methods. But now 
let us take a more difficult case: we assume the body to be falling 


through a viscous medium under the action of gravity. Here 
the force is a sum of two parts: gravity, $-mg$, and a frictional
force depending on velocity. It is found experimentally that
for small velocities this frictional force, in a viscous medium, is
proportional to the velocity, with, of course, a negative
coefficient, since it opposes the motion, changing signs with the velocity.
Let it be called $-kv$, $k$ being the coefficient, which depends in a
complicated way on the shape and size of the body, and is
proportional to the coefficient of viscosity of the fluid. Then we
have

$$m\frac{dv}{dt} = -mg - kv,$$

or

$$m\frac{dv}{dt} + kv = -mg. \qquad (2)$$

This is a simple sort of differential equation, in a standard form. 
It is 

1. A linear differential equation. That is, it contains $v$ and its
derivatives (as $v$, $dv/dt$, $d^2v/dt^2$, etc.) only in the first power (in
$dv/dt$, $kv$), or the zero power ($-mg$, independent of $v$), not as
squares or cubes [as, for example, $(dv/dt)^2$], or products (as
$v\,dv/dt$).

2. A differential equation of the first order (containing no
derivative higher than the first).

3. An inhomogeneous equation (it contains terms of both the
first power and the zero power in $v$ and its derivatives, while a
homogeneous equation contains only terms of the same power,
as all of the first power. That is, if the term $-mg$ were absent,
the equation would be homogeneous).

We cannot solve Eq. (2) by direct integration, for if we
integrate with respect to $t$, one term would be $\int v\,dt$, which we cannot
evaluate, since $v$ is an unknown function of time. Thus we must
proceed differently. Let us assume that $v$ is given by a power
series in the time, $v = A_0 + A_1t + A_2t^2 + \cdots$, and try to determine
the coefficients. We do this by substituting the series in the
equation. We have by direct differentiation

$$\frac{dv}{dt} = A_1 + 2A_2t + 3A_3t^2 + \cdots + (n+1)A_{n+1}t^n + \cdots.$$

Then, substituting, we have

$$m[A_1 + 2A_2t + 3A_3t^2 + \cdots + (n+1)A_{n+1}t^n + \cdots] + k(A_0 + A_1t + A_2t^2 + \cdots + A_nt^n + \cdots) = -mg,$$

or, dividing by $m$ and collecting powers of $t$,

$$\left(A_1 + \frac{k}{m}A_0 + g\right) + \left(2A_2 + \frac{k}{m}A_1\right)t + \cdots + \left[(n+1)A_{n+1} + \frac{k}{m}A_n\right]t^n + \cdots = 0. \qquad (3)$$

This states that a certain power series in $t$ is equal to zero, for all
values of $t$. But the only function of $t$ which is always zero is
zero itself, and by Taylor's theorem the expansion of zero in
power series is a series all of whose coefficients are zero. Thus
Eq. (3) can only be satisfied, for all values of $t$, if each coefficient
vanishes:

$$A_1 + \frac{k}{m}A_0 + g = 0, \qquad 2A_2 + \frac{k}{m}A_1 = 0, \qquad \ldots, \qquad (n+1)A_{n+1} + \frac{k}{m}A_n = 0. \qquad (4)$$

Here we have an infinite set of equations to solve for the
coefficients $A$. Fortunately they are so arranged that we can solve
them, getting all $A$'s in terms of $A_0$, if we start with the first and
work down:

$$\begin{aligned}
A_1 &= -\left(\frac{k}{m}A_0 + g\right), \\
A_2 &= -\frac{1}{2}\frac{k}{m}A_1 = \frac{1}{2!}\frac{k}{m}\left(\frac{k}{m}A_0 + g\right), \\
A_3 &= -\frac{1}{3}\frac{k}{m}A_2 = -\frac{1}{3!}\frac{k^2}{m^2}\left(\frac{k}{m}A_0 + g\right), \\
A_{n+1} &= -\frac{1}{n+1}\frac{k}{m}A_n = (-1)^{n+1}\frac{1}{(n+1)!}\frac{k^n}{m^n}\left(\frac{k}{m}A_0 + g\right).
\end{aligned} \qquad (5)$$

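And the power series is

$$v = A_0 + A_1t + \cdots = A_0 - \left(\frac{k}{m}A_0 + g\right)\left(t - \frac{1}{2!}\frac{k}{m}t^2 + \frac{1}{3!}\frac{k^2}{m^2}t^3 - \cdots\right). \qquad (6)$$

Thus we have the solution. If we set $t = 0$, we have $v = A_0$, so
that $A_0$ is simply the initial velocity, and is the arbitrary constant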



which we meet in the solution. We could compute from our 
series the value of v at any time t, knowing the initial velocity. 

It happens in this case that we can recognize the infinite series 
as representing a familiar function. For we have 

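$$e^{-(k/m)t} = 1 - \frac{k}{m}t + \frac{1}{2!}\frac{k^2}{m^2}t^2 - \frac{1}{3!}\frac{k^3}{m^3}t^3 + \cdots,$$

which has close connection with our series, so that we can write
at once

$$v = A_0 + \frac{m}{k}\left(\frac{k}{m}A_0 + g\right)\left(e^{-(k/m)t} - 1\right) = \left(A_0 + \frac{mg}{k}\right)e^{-(k/m)t} - \frac{mg}{k}.$$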


Fig. 1. — Velocity of damped falling body, with various initial conditions. 

We can see the physical properties of the solution most clearly
from the graph in Fig. 1. No matter what the initial velocity
may have been, the particle finally settles down to motion with a
constant speed, given by $-mg/k$. The initial velocity is $A_0$, and
if this is greater than the final velocity, the body slows down; if
it is less, it speeds up, to attain this final speed.
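
The series method of this section can be carried out numerically. The sketch below builds the coefficients one after another from $A_1 = -[(k/m)A_0 + g]$ and $(n+1)A_{n+1} + (k/m)A_n = 0$, sums the series, and compares it with the exponential form $(A_0 + mg/k)e^{-(k/m)t} - mg/k$. The function names and the numerical values of `m`, `k`, and `g` are illustrative assumptions.

```python
import math

def series_velocity(t, v0, k, m, g, terms=60):
    """Sum v = A0 + A1 t + A2 t^2 + ... with the coefficients found from the recurrence."""
    total = v0                         # A0 is the initial velocity
    coeff = -((k / m) * v0 + g)        # A1
    power = t
    for n in range(1, terms):
        total += coeff * power         # add A_n t^n
        coeff = -(k / m) * coeff / (n + 1)   # A_{n+1} from A_n
        power *= t
    return total

def closed_velocity(t, v0, k, m, g):
    """The same solution written with the exponential."""
    return (v0 + m * g / k) * math.exp(-(k / m) * t) - m * g / k

m, k, g = 1.0, 2.0, 9.8   # hypothetical values in consistent units
for t in (0.0, 0.5, 1.5):
    print(t, series_velocity(t, 0.0, k, m, g), closed_velocity(t, 0.0, k, m, g))

# After a long time the velocity settles down to -mg/k, whatever the initial velocity.
print(closed_velocity(50.0, 3.0, k, m, g), -m * g / k)
```

The series is convenient only for moderate values of $(k/m)t$; for long times the exponential form is the practical one, since many terms of alternating sign would have to be added.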

10. Particular and General Solutions for Falling Body with 
Viscosity. — It is instructive to notice that we can solve our 


problem in an elementary way. Our equation is $m\,dv/dt + kv =
-mg$. Plainly a particular solution is given by assuming a
constant velocity. Then $dv/dt$ is zero, so that the equation is
$kv = -mg$, or $v = -mg/k$. But this is not the most general
solution, for it does not have an arbitrary constant; it represents
merely the particular case in which the initial velocity happened
to be just the correct final value, and is unable to describe any
other initial condition. To get a general solution, we proceed as
follows: we take the homogeneous equation $m\,dv/dt + kv = 0$,
which we obtain from our inhomogeneous equation by leaving
out the term $-mg$. We can easily solve this: writing it $dv/v =
-(k/m)\,dt$, and integrating, we have $\ln v = -(k/m)t +
\text{constant}$, and taking the exponential, $v = \text{constant} \times e^{-(k/m)t}$, where
the constant is arbitrary. Then the sum of this general solution
of the homogeneous equation, and the particular solution $-(m/k)g$
of the inhomogeneous equation, is the solution we desire. We
may prove this easily. For we have

$$\left(m\frac{d}{dt} + k\right)Ce^{-(k/m)t} = 0, \qquad \left(m\frac{d}{dt} + k\right)\left(-\frac{m}{k}g\right) = -mg,$$

and, adding,

$$\left(m\frac{d}{dt} + k\right)\left[Ce^{-(k/m)t} - \frac{m}{k}g\right] = -mg,$$

showing that the function $Ce^{-(k/m)t} - (m/k)g$ satisfies the
differential Eq. (2).
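
The statement that $Ce^{-(k/m)t} - (m/k)g$ satisfies the equation for every value of the arbitrary constant $C$ can also be tested numerically. In this sketch the derivative is approximated by a central difference, and the values of `m`, `k`, `g` and the sample constants are illustrative assumptions.

```python
import math

m, k, g = 2.0, 0.5, 9.8   # hypothetical values in consistent units

def v(t, C):
    """General solution of the homogeneous equation plus the particular solution -mg/k."""
    return C * math.exp(-(k / m) * t) - m * g / k

def residual(t, C, h=1.0e-5):
    """m dv/dt + k v + m g, which should vanish if v satisfies m dv/dt + k v = -mg."""
    dvdt = (v(t + h, C) - v(t - h, C)) / (2 * h)   # central-difference derivative
    return m * dvdt + k * v(t, C) + m * g

for C in (-5.0, 0.0, 3.0):
    print(C, residual(1.0, C))
```

The residual is zero to within the accuracy of the numerical derivative for each choice of $C$; the case $C = 0$ is the particular solution alone.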

The procedure we have just used is an illustration of the general 
rule : A general solution of an inhomogeneous equation is obtained 
by adding a particular solution of the inhomogeneous equation, 
and a general solution of the related homogeneous equation. In this 
statement, the terms "particular solution" and "general solu- 
tion" are used in a technical sense: a "particular solution" is 
one which satisfies the differential equation but has no arbitrary 
constants; a "general solution" is one which has its full comple- 
ment of arbitrary constants. The proof of the rule in general is 
carried out just as in our case, adding the particular solution of 
the inhomogeneous equation and a general solution of the homo- 
geneous equation, and showing that the sum satisfies the inhomo- 
geneous equation. One thing should be noted: the properties 
we have been discussing depend entirely on the linear character 


of the differential equation, for it is only with linear functions
$f$ that $f(x_1) + f(x_2) = f(x_1 + x_2)$.

11. Electric Circuit Containing Resistance and Inductance.— 
The theory of the electrical circuit reminds one in many ways of 
mechanical principles: electric current is analogous to velocity, 
charge to displacement, electromotive force to mechanical force. 
Thus in a circuit containing a resistance, inductance, and con- 
denser, all in series, the current can flow through the circuit, 
piling up in the condenser because it cannot flow through. Let 
q be the charge on one plate of the condenser ( — q being the charge 
on the other), and let i be the current flowing through the circuit 
toward the condenser plate in question, so that the current 
measures just the amount of charge per second flowing onto the 
condenser plate, or i = dq/dt (as v = dx/dt). Now let the 
coefficient of self-induction of the circuit be L, the resistance R, 
the capacity of the condenser C. Then there are three e.m.fs. 
(electromotive forces) acting on the current, in addition to a 
possible external e.m.f. E from a battery: the back e.m.fs. of 

induction, resistance, and capacity. The first is $-L\,di/dt$, the
electromotive force induced in a circuit when the current changes;
the second is $-Ri$, the value familiar from Ohm's law; the third
is $-q/C$, as given by the elementary law of the condenser.
These are all negative, for they act to oppose the current. Now
the law of the circuit is that the total e.m.f. acting on the circuit
is zero:

$$-L\frac{di}{dt} - Ri - \frac{q}{C} + E = 0,$$

or

$$L\frac{di}{dt} + Ri + \frac{q}{C} = E. \qquad (8)$$

This is a differential equation. Let us take the special case where 
there is no condenser, so that the equation is Ldi/dt + Ri = E. 
The equation is then exactly analogous to the equation mdv/dt + 
kv = F, which we had for a falling body with viscosity. And we 
see that self-induction is analogous to inertia, resistance to 
viscosity. The analogy is often valuable. 

If now the applied e.m.f. E of the battery is constant, the 
problem can be solved mathematically just as before, and we find 
$i = \text{constant} \times e^{-(R/L)t} + E/R$. The first term is the transient
effect, of arbitrary size, as we see from the arbitrary constant,
rapidly dying out as time goes on, while the second is the constant
value given by Ohm's law, the value to which the current tends
if we wait long enough.
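
The circuit solution can be tabulated in the same way as the mechanical one. In the sketch below the arbitrary constant is fixed by an assumed initial current `i0`, so that $i = (i_0 - E/R)e^{-(R/L)t} + E/R$; the function name and the circuit values are hypothetical.

```python
import math

def current(t, L, R, E, i0=0.0):
    """Current in a circuit with inductance L, resistance R, and constant e.m.f. E."""
    steady = E / R                                   # value given by Ohm's law
    return (i0 - steady) * math.exp(-(R / L) * t) + steady

L, R, E = 2.0, 4.0, 12.0   # hypothetical henries, ohms, volts
for t in (0.0, 0.5, 1.0, 5.0):
    print(t, current(t, L, R, E))
```

The ratio $L/R$ plays the part that $m/k$ played for the falling body: after a time $L/R$ the transient has fallen to $1/e$ of its initial size.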


1. Show that the solution $v = (A_0 + mg/k)e^{-(k/m)t} - mg/k$ reduces
properly to uniformly accelerated motion in the limiting case where the
viscous resistance vanishes. Illustrate this graphically, showing curves for
several different $k$'s, and finally for $k = 0$, all with the same initial velocity.

2. A raindrop weighs 0.1 gm., and after falling from rest reaches a limiting 
speed of 1,000 cm. per second by the time it reaches the earth. How long 
did it take to reach half its final speed? Nine tenths of its final speed? 
How far did it travel before reaching half its final speed? For how long 
could its velocity be described by the simple law v = —gt to an error of 
1 per cent? 

3. At high velocities, the viscous resistance is proportional to the third 
power of the velocity. Assuming this law, set up the differential equation 
for a particle falling under gravity and acted on by such a viscous drag. 
Solve by power series, obtaining at least four terms in the expansion for v 
as a function of t. Draw graphs of velocity as function of time, and discuss 
the solutions physically. 

4. Using the same law of viscosity as in the preceding problem, but assum- 
ing no gravitational force, solve by direct integration of the differential 
equation for the case of a particle starting with given initial velocity and 
being damped down to rest. Show by Taylor's expansion of this function 
that it agrees with the special case of the power series of the preceding 
problem obtained by letting the gravitational force be zero. 

5. A large coil has a resistance of 0.7 ohm, inductance of 5 henries. Until
t = 0, no current is flowing in the coil. At that moment, a battery of 5 
volts e.m.f. is connected to it. After 5 sec, the battery is short-circuited 
and the current in the coil allowed to die down. Compute the current as 
function of the time, drawing a curve to represent it. 

6. A coil having $L = 10$ henries, $R = 1$ ohm, has no current flowing in
it until $t = 0$. Then it has an applied voltage increasing linearly with the
time, from zero at $t = 0$, to 1 volt at $t = 1$ sec. After $t = 1$, the e.m.f.
remains equal to 1 volt. By series methods find the current at any time,
and plot the curve.

7. Suppose we have a rocket, shot off with initial velocity $v_0$, and
thereafter losing mass according to the law $m = m_0(1 - ct)$, where $m$ is the mass
at any time, $m_0$ the initial mass at $t = 0$, $c$ is a constant, and where the mass
lost does not have appreciable velocity after it leaves the rocket. Show 
that on account of the loss of mass the rocket is accelerated, just as if a force 
were acting on a body of constant mass. The rocket is acted on by a viscous 
resisting force in addition. Taking account of these forces, find the differ- 
ential equation for its velocity as a function of time, and integrate the equa- 
tion directly. Now find also the solution for v as a power series in the time. 
Show that the resulting series agrees with that obtained by expanding the 


exact solution. Calculate the limiting ratio of successive terms in the 
power series, as we go out in the series, and from this result obtain the region 
of convergence of the series. Is this result reasonable physically? What 
happens in the exact solution outside the range of convergence? 

8. In a radioactive disintegration, the number of atoms disintegrating per 
second, and turning into atoms of another sort, is simply proportional to the 
total number of radioactive atoms present. Write down the differential 
equation for the number of atoms present at any time, and find its solution. 
Assuming that half the atoms of a sample of radium disintegrate in 1,300 
years, how many would decay in the first year? 

9. If at the same time radium were being produced at a constant rate by 
disintegration of uranium, how would this change the situation in the 
preceding problem? Set up the new differential equation. Assuming 
that we start without any radium, but with pure uranium, find the amount 
of radium as a function of the time. Show that the amount of radium 
approaches an equilibrium amount, which it reaches in time, whether the 
initial amount of radium is greater or less than the equilibrium amount. 

10. Find a series solution for the differential equation $m\,dv/dt + kv = c/t$,
where $c$ is a constant, representing a damped motion under the action of an
external force which decreases inversely proportionally to the time, the
series having the form $v = a_1/t + a_2/t^2 + \cdots$. Show that this series is
divergent for all values of $t$. Show that the differential equation is formally

satisfied by the expression v = er* J ^ ~dt. This solution is convergent for t 

negative. The integral | j dt is known as the exponential integral func- 
tion, and is important in physics and mathematics. It is frequently calcu- 
lated by using the above divergent series. Explain how this procedure might 

be valid. 

11. Suppose a particle is acted on by a damping force proportional to the 
velocity, and to a force which varies sinusoidally with the time. Solve the 
resulting differential equation for velocity as function of time, by the series 
method, by expanding the force in power series in the time. Can you 
recognize the analytical form of the resulting power series? 

12. Solve by power series Bessel's equation
$\dfrac{d^2y}{dx^2} + \dfrac{1}{x}\dfrac{dy}{dx} + y = 0$. The
result is Bessel's function of the zero order, $J_0(x)$. From the series, plot
$J_0(x)$ for $x$ between 0 and 5.

13. The equation for Bessel's function of the $m$th order, $J_m(x)$, is
$\dfrac{d^2y}{dx^2} + \dfrac{1}{x}\dfrac{dy}{dx} + \left(1 - \dfrac{m^2}{x^2}\right)y = 0$.
Solve by power series, showing that the first
term in the expansion is that in $x^m$. Plot $J_1(x)$ for $x$ between 0 and 5.
Bessel's functions oscillate, like the sine and cosine, all the way to infinity. 
We shall use them in discussing standing waves in a circular membrane, and 
for many other problems. The second independent solution of the equation 
is infinite at the origin, and hence cannot be expanded in power series. 



In the last chapter we have found a general method of power 
series for solving differential equations, and have applied it to 
the problem of motion under viscous forces. Next we consider 
the same method, applied to somewhat different problems: a 
particle acted on by restoring forces proportional to the distance, 
or an electric circuit containing inductance and capacity. 

12. Particle with Linear Restoring Force. — Suppose that the 
force acting on a particle is proportional to the displacement 
from a fixed position, and opposite to the displacement, a so-called 
linear restoring force. This force is $-kx$, if $x$ is the
displacement, $k$ a constant. For the moment we assume that there is
no gravitational or other external force acting. Then the
equation of motion is $m\,d^2x/dt^2 = -kx$, or

$$m\frac{d^2x}{dt^2} + kx = 0. \qquad (1)$$

This is a homogeneous linear differential equation of the second
order, with constant coefficients (that is, $m$, $k$ are independent
of time). We solve it in series as before. If $x = A_0 + A_1t +
A_2t^2 + \cdots$, we have immediately, by the method used before,

$$(2mA_2 + kA_0) + (3\cdot 2\,mA_3 + kA_1)t + (4\cdot 3\,mA_4 + kA_2)t^2 + \cdots = 0.$$

Thus, setting the separate coefficients equal to zero, and solving 
one equation after the other, we find 

$$A_2 = -\frac{1}{2}\frac{k}{m}A_0, \qquad A_3 = -\frac{1}{3!}\frac{k}{m}A_1, \qquad \ldots$$


These equations determine all the coefficients in terms of two
arbitrary ones, $A_0$ and $A_1$, which are the two arbitrary constants



to be expected in the solution of a second-order differential 
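equation. The solution may be written

$$x = A_0\left[1 - \frac{1}{2!}\frac{k}{m}t^2 + \frac{1}{4!}\left(\frac{k}{m}\right)^2t^4 - \cdots\right] + A_1\left[t - \frac{1}{3!}\frac{k}{m}t^3 + \frac{1}{5!}\left(\frac{k}{m}\right)^2t^5 - \cdots\right].$$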


We now observe that these series represent well-known functions: 
the first is the cosine, the second the sine, except for a factor, so 
that we have 

$$x = A_0\cos\sqrt{k/m}\,t + A_1\sqrt{m/k}\,\sin\sqrt{k/m}\,t. \qquad (4)$$

Thus the motion is a periodic one, as shown by the sinusoidal 
functions. The period T is found from the fact that when t 
increases by T, the sine or cosine must come back to its initial 
value, which it does when its argument (that is, the thing whose 
cosine we are taking) increases by $2\pi$. Thus $\sqrt{k/m}\,T = 2\pi$,
$T = 2\pi\sqrt{m/k}$, the familiar formula for the period in simple
harmonic motion. From this, the frequency $\nu$ is given by
$\nu = 1/T = (1/2\pi)\sqrt{k/m}$ and the angular velocity $\omega$ by $\omega = 2\pi\nu = \sqrt{k/m}$. It is often convenient to use these relations in
rewriting the equation of motion, writing it

$$\frac{d^2x}{dt^2} + \omega^2x = 0, \quad\text{or}\quad \frac{d^2x}{dt^2} + 4\pi^2\nu^2x = 0. \qquad (5)$$
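
The series solution can be checked numerically. The following Python sketch sums the power series using the recursion $m(n+2)(n+1)A_{n+2} + kA_n = 0$ and compares the result with the closed form of Eq. (4). The function names and the sample values of k, m, $A_0$, $A_1$ are illustrative assumptions, not taken from the text.

```python
import math

def series_solution(t, k, m, a0, a1, n_terms=40):
    """Sum x = A0 + A1 t + A2 t^2 + ... with A_{n+2} = -(k/m) A_n / ((n+2)(n+1))."""
    coeffs = [a0, a1]
    for n in range(n_terms - 2):
        coeffs.append(-(k / m) * coeffs[n] / ((n + 2) * (n + 1)))
    return sum(c * t**n for n, c in enumerate(coeffs))

def closed_form(t, k, m, a0, a1):
    """Eq. (4): x = A0 cos(sqrt(k/m) t) + A1 sqrt(m/k) sin(sqrt(k/m) t)."""
    w = math.sqrt(k / m)
    return a0 * math.cos(w * t) + a1 / w * math.sin(w * t)

# Illustrative values (assumed): k = 4, m = 1, A0 = 0.5, A1 = -1.
k, m, a0, a1 = 4.0, 1.0, 0.5, -1.0
for t in (0.0, 0.7, 2.0):
    print(t, series_solution(t, k, m, a0, a1), closed_form(t, k, m, a0, a1))

# Period T = 2 pi sqrt(m/k); the motion repeats after this time.
period = 2 * math.pi * math.sqrt(m / k)
print(closed_form(0.3, k, m, a0, a1), closed_form(0.3 + period, k, m, a0, a1))
```

The two columns printed in the loop should agree to within rounding error, since forty terms are far more than the series needs for these small values of t.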

13. Oscillating Electric Circuit.— In the last chapter, we have 
seen that the equation for an electric circuit containing resistance, 
inductance, and capacity, is L di/dt + Ri + q/C = E, where i 
is the current, q the charge on the condenser, and E the impressed 
electromotive force. We also saw that i = dq/dt. Substituting, 
we obtain 

$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = E. \qquad (6)$$

This is an inhomogeneous second-order linear differential equation 
for q, which becomes homogeneous if E = 0. We consider that 
case, and in particular let R be zero. Then the problem becomes 
mathematically equivalent to the preceding one, and has the 
differential equation $d^2q/dt^2 + q/LC = 0$. The solution is
$q = A_0\cos\sqrt{1/LC}\,t + A_1\sqrt{LC}\,\sin\sqrt{1/LC}\,t$, so that the
current oscillates in the circuit. By differentiating, we can find
the current directly instead of the charge: $i = dq/dt = -A_0\sqrt{1/LC}\,\sin\sqrt{1/LC}\,t + A_1\cos\sqrt{1/LC}\,t$, so that the
oscillations of charge and current are similar. The period of
oscillation is given by $T = 2\pi\sqrt{LC}$, increasing as either the
inductance or the capacity becomes large. 

14. The Exponential Method of Solution. — We have found 
that the solutions of our vibration problems, as well as of several 
other differential equations, come out either as exponential 
functions, or as sines or cosines. As a matter of fact, any 
homogeneous linear differential equation, with constant coeffi- 
cients, has such solutions. On account of the importance of 
this type of equation, we shall consider its solution specially. 
Let us take a second-order differential equation, 

$$\frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = 0, \qquad (7)$$

a type which includes the mechanical and electrical problems 
we have worked with. We can show very easily that this has 
an exponential solution, $y = e^{kx}$. For let us substitute this
function into the equation. We have $dy/dx = ky$, $d^2y/dx^2 = k^2y$, so that the equation becomes $(k^2 + ak + b)e^{kx} = 0$. This
equation is factored, and since $e^{kx}$ is not always zero, the other
factor must be, and we have $k^2 + ak + b = 0$, or solving the
quadratic by formula, $k = -a/2 \pm \sqrt{(a/2)^2 - b}$. Thus if k
equals either $k_1 = -a/2 + \sqrt{(a/2)^2 - b}$, or $k_2 = -a/2 - \sqrt{(a/2)^2 - b}$, $e^{kx}$ is a solution of the equation. We have, in
fact, two independent solutions. 

Now if we have two independent solutions of a second-order 
linear homogeneous differential equation, we can readily show that 
any linear combination of them is itself a solution. If such a 
solution has two arbitrary constants, it is a general solution. Thus 
we can write the general solution of Eq. (7) 

$$y = Ae^{k_1x} + Be^{k_2x},$$

$$y = e^{-(a/2)x}\left[Ae^{\sqrt{(a/2)^2 - b}\,x} + Be^{-\sqrt{(a/2)^2 - b}\,x}\right]. \qquad (8)$$

This is the solution, with its two arbitrary constants, and it 
might seem as if no further discussion were necessary. But 
there is an interesting feature still to consider: the quantity
$(a/2)^2 - b$ under the radical may easily be negative, and the
square root imaginary, so that we have to investigate the
exponentials of imaginary quantities.
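
As a small illustration of the exponential method, the Python sketch below computes the two roots of $k^2 + ak + b = 0$ with complex arithmetic, so that a negative quantity under the radical causes no difficulty, and then substitutes $y = e^{kx}$ back into Eq. (7). The function names and the sample coefficients a = 2, b = 5 are assumptions chosen only for the example.

```python
import cmath

def characteristic_roots(a, b):
    """Roots k1, k2 of k^2 + a k + b = 0; cmath.sqrt accepts a negative argument."""
    root = cmath.sqrt((a / 2) ** 2 - b)
    return -a / 2 + root, -a / 2 - root

def residual(k, a, b, x):
    """Left side of Eq. (7), y'' + a y' + b y, evaluated for y = exp(k x)."""
    y = cmath.exp(k * x)
    return k**2 * y + a * k * y + b * y

# Illustrative coefficients (assumed): (a/2)^2 - b = -4, so the roots are complex.
a, b = 2.0, 5.0
k1, k2 = characteristic_roots(a, b)
print(k1, k2)
print(abs(residual(k1, a, b, 0.3)), abs(residual(k2, a, b, 0.3)))
```

Both residuals should vanish to within rounding error, which is the statement that each exponential separately satisfies the equation.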


Suppose, for example, that the damping term is zero: a = 0,
and the differential equation is $d^2y/dx^2 + by = 0$. This is the
only case we have so far worked out in detail. Then the solution
becomes $y = Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x}$, where $i = \sqrt{-1}$. But we
have already seen that the solution of this same equation is
$C\cos\sqrt{b}\,x + D\sin\sqrt{b}\,x$. If both forms are right, there must
be connections between exponential and sinusoidal functions, 
which we now proceed to investigate. 

15. Complex Exponentials. — Let us investigate the function 
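$e^{ix}$ by series methods. We have at once

$$e^{ix} = 1 + ix - \frac{x^2}{2!} - \frac{ix^3}{3!} + \frac{x^4}{4!} + \cdots = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + i\left(x - \frac{x^3}{3!} + \cdots\right),$$

$$e^{ix} = \cos x + i\sin x.$$

Similarly we have

$$e^{-ix} = \cos x - i\sin x. \qquad (9)$$

We can solve for cos x by adding these equations and dividing
by 2, or for sin x by subtracting and dividing by 2i:

$$\cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \sin x = \frac{e^{ix} - e^{-ix}}{2i}. \qquad (10)$$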

These theorems are fundamental in the study of exponential 
and sinusoidal functions. 
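
These relations are easy to confirm numerically. The Python sketch below sums the exponential series for an imaginary argument and compares it with cos x + i sin x, then recovers the cosine and sine from the two exponentials. The helper name `exp_series` and the sample value x = 1.3 are assumptions made for the illustration.

```python
import math

def exp_series(z, n_terms=40):
    """Partial sum of 1 + z + z^2/2! + z^3/3! + ... for a complex argument z."""
    total, term = 0j, 1 + 0j
    for n in range(n_terms):
        total += term
        term *= z / (n + 1)   # next term: z^(n+1) / (n+1)!
    return total

x = 1.3  # illustrative real argument (assumed)
lhs = exp_series(1j * x)
rhs = complex(math.cos(x), math.sin(x))
print(lhs, rhs)

# Cosine and sine recovered from the two exponentials.
cos_from_exp = (exp_series(1j * x) + exp_series(-1j * x)) / 2
sin_from_exp = (exp_series(1j * x) - exp_series(-1j * x)) / 2j
print(cos_from_exp, sin_from_exp)
```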

In terms of the formulas of the last paragraph, we can readily 
see that our two formulations of simple harmonic motion are 
both correct. For we have

$$Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x} = A(\cos\sqrt{b}\,x + i\sin\sqrt{b}\,x) + B(\cos\sqrt{b}\,x - i\sin\sqrt{b}\,x)$$
$$= (A + B)\cos\sqrt{b}\,x + i(A - B)\sin\sqrt{b}\,x,$$

or one constant times the cosine plus another times the sine, 
which is the more familiar solution. By giving A and B suitable 
complex values, we can have both coefficients real. But to 
know how to do this, and to understand the whole process, we 
should study complex numbers for themselves. Let us then 
make a little survey of the theory of complex numbers. 






16. Complex Numbers. — A complex number is usually written
$A + Bi$, where A and B are real, $i = \sqrt{-1}$. It is often plotted
in a diagram: we let abscissas represent real parts of numbers, 
ordinates the imaginary parts, so that A measures the abscissa, 
B the ordinate, of the point representing A + Bi. Every point 
in the plane corresponds to a complex number, and vice versa. 
All real numbers lie along the axis of abscissas, all pure imagi- 
naries along the axis of ordinates, and the other complex numbers 
between. But it is also often convenient to think of a complex
number as being represented, not merely by a point, but by the
vector from the origin out to the point. The fundamental reason
for this is that these vectors obey the parallelogram law of addition,
just as force or velocity vectors do (see Fig. 2). The vector
treatment is suggestive in many ways. For example, we can consider the
angle between two complex numbers. Thus, any real number, and
any pure imaginary number, are at an angle of 90 deg. to each
other. Or, the number 1 + i is at an angle of 45 deg. with either 1
or i. When a complex number is regarded as a vector, we can
describe it by two quantities: the absolute magnitude of the
vector, or its length, $\sqrt{A^2 + B^2}$; and the angle which it makes
with the real axis, or $\tan^{-1} B/A$.

The vector representation of complex numbers has very close
connection with complex exponential functions. Let us consider
the complex number $e^{i\theta}$, where $\theta$ is a real quantity. As we have
seen, this equals $\cos\theta + i\sin\theta$, so that the real part is $\cos\theta$,
the imaginary part $\sin\theta$. The vector representing this number
is then a vector of unit magnitude, for $\sqrt{\cos^2\theta + \sin^2\theta} = 1$.
Further, it makes just the angle $\theta$ with the real axis. We can
see interesting special cases. The number $e^{\pi i/2} = i$, as we can
see at once from the vector diagram, or from the fact that it
equals $\cos \pi/2 + i\sin \pi/2 = i$. Similarly $e^{\pi i} = -1$, $e^{2\pi i} = e^{4\pi i} = \cdots = 1$. This last result shows that the exponential
function of an imaginary argument is periodic with period $2\pi i$,
similarly to the sine and cosine of a real argument.

Fig. 2. — Law of addition of complex vectors. The vector E + Fi
represents the vector sum of A + Bi and C + Di. Evidently OE =
OA + AE = OA + OC, and OF = OB + OD. Hence E + Fi =
(A + C) + (B + D)i.

Next we look at the number $re^{i\theta}$, where r, $\theta$ are both real. It
differs from $e^{i\theta}$ in that both real and imaginary parts are multiplied
by the same real factor r, which simply increases the length
of the vector to r, without changing the angle. Thus $re^{i\theta}$ is a
vector of length r, angle $\theta$. As a result, we can easily write any
complex number in complex exponential form: $A + Bi = re^{i\theta}$,
where $r = \sqrt{A^2 + B^2}$, $\theta = \tan^{-1} B/A$, or $A = r\cos\theta$, $B = r\sin\theta$ (see Fig. 3). We may use these results in showing
what happens when two complex numbers are multiplied
together. Suppose we wish to form the product $(A + Bi)(C + Di)$.
Of course, multiplying directly, this equals $(AC - BD) + (AD + BC)i$, so that we
can easily find real and imaginary parts of the product, but
this is not very informing. It is better to write $A + Bi = r_1e^{i\theta_1}$, $C + Di = r_2e^{i\theta_2}$. Then
the product is $(r_1e^{i\theta_1})(r_2e^{i\theta_2}) = (r_1r_2)e^{i(\theta_1 + \theta_2)}$. That is, the magnitude
of the product of two complex numbers is the product
of the magnitudes, and the angle is the sum of their angles.
Suppose we have a complex number $re^{i\theta}$, and consider the
closely related number $re^{-i\theta}$. The second is called the conjugate
of the first. If we have a complex number in the form $A + Bi$,
its conjugate is $A - Bi$. Or in general, if we change the sign
of i wherever it appears in a complex number, we obtain its
conjugate. Graphically, the vector representing the conjugate
of a number is the mirror image of the vector representing the
number itself, in the axis of real numbers. Now conjugate
numbers have two important properties: the sum of a number
and its conjugate is real (for the imaginary parts just cancel in
taking this sum), and the product is real (for this equals
$r^2e^{i(\theta - \theta)} = r^2$). The second fact is useful in finding the absolute
magnitude of a complex number: if z is complex, $\bar{z}$ its conjugate
(this is the usual notation), then $\sqrt{z\bar{z}}$ equals the absolute
magnitude of z. From the other fact, we may find the real and
imaginary parts of complex numbers: $(z + \bar{z})/2$ equals the real part
of z, and as we can easily show, $(z - \bar{z})/2i$ equals the imaginary
part. We see examples in our relations between sinusoidal and
exponential functions where $e^{-ix}$ is the conjugate of $e^{ix}$, so that
$(e^{ix} + e^{-ix})/2$ should, and does, equal the real part of $e^{ix}$, or cos x,
and $(e^{ix} - e^{-ix})/2i$ equals the imaginary part, or sin x.

Fig. 3. — The complex number P equals either A + Bi, or $re^{i\theta}$.
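
The rules for products and conjugates can be tried out directly with Python's built-in complex numbers. In the sketch below the two sample numbers 3 + 4i and 1 - 2i are arbitrary choices made for the illustration; `cmath.polar` returns the length and angle of a complex number and `cmath.rect` rebuilds the number from them.

```python
import cmath
import math

# Two illustrative complex numbers (assumed values).
z1, z2 = complex(3, 4), complex(1, -2)

# Polar form: length r and angle theta of each vector.
r1, theta1 = cmath.polar(z1)
r2, theta2 = cmath.polar(z2)

# Product by multiplying lengths and adding angles ...
product_polar = cmath.rect(r1 * r2, theta1 + theta2)
# ... and directly, (AC - BD) + (AD + BC)i.
product_direct = z1 * z2
print(product_polar, product_direct)

# Conjugate relations for z = z1.
z = z1
magnitude = math.sqrt((z * z.conjugate()).real)   # sqrt(z zbar)
real_part = (z + z.conjugate()) / 2               # (z + zbar) / 2
imag_part = (z - z.conjugate()) / 2j              # (z - zbar) / 2i
print(magnitude, real_part, imag_part)
```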

17. Application of Complex Numbers to Vibration Problems. — 

There are two different, though related, ways of applying com- 
plex numbers to vibration problems. The first, and perhaps 
more logical, is directly suggested by what we have done. We 
found for undamped vibrations that $y = Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x}$.
Now naturally we wish y to be real, since it represents a real
displacement. To do this, we make use of the proposition that
we have just found, that the sum of a complex number and its
conjugate is real. Since $e^{-i\sqrt{b}\,x}$ is the conjugate of $e^{i\sqrt{b}\,x}$, we
achieve the desired result if we make $B = \bar{A}$, for then the whole
second term is just the conjugate of the first. Incidentally,
if we write $A = \frac{C}{2}e^{-i\alpha}$, we have

$$y = \frac{C}{2}e^{i(\sqrt{b}\,x - \alpha)} + \frac{C}{2}e^{-i(\sqrt{b}\,x - \alpha)} = C\cos(\sqrt{b}\,x - \alpha), \qquad (11)$$

giving a form, in terms of amplitude C and phase $\alpha$, which is
often useful and important.

The second method of treatment is more common, particularly 
in electrical applications. Suppose we work directly with the 
complex solution $y = Ae^{i\sqrt{b}\,x}$, but consider that only the real
part is of physical significance. This real part, as we have 
seen, is half the sum of this quantity and its conjugate, so that, 
except for a factor of 2, it comes to the same thing we have 
considered before. However, it is often easier to think of it in 
this way, and the process of using a complex solution, and finally 
taking the real part, is very common. Of course, if A is real, 


the real part is simply $A\cos\sqrt{b}\,x$; if A is complex, we may write
it $Ce^{-i\alpha}$, and the real part of the product is $C\cos(\sqrt{b}\,x - \alpha)$.
This second method is particularly interesting in discussing simple
harmonic motion, where x is replaced by t, and $\sqrt{b}$ by $\omega$, so
that we are considering the real part of $Ae^{i\omega t}$. The complex
number is given by a vector of length $|A|$, rotating in the complex
plane with angular velocity $\omega$. And its real part is simply the
projection of the vector along the real axis. Thus it corresponds 
exactly to the most elementary formulation of simple harmonic 
motion, as the projection of a circular motion on a diameter. 
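
Both ways of using complex numbers can be compared in a few lines of Python. In this sketch the amplitude C, phase alpha, and angular velocity omega are sample values assumed for illustration, and the two function names are hypothetical; the first function adds a complex term and its conjugate, the second takes the real part of a single rotating vector.

```python
import cmath
import math

def conjugate_pair(C, alpha, omega, t):
    """First method: A e^{i w t} + conj(A) e^{-i w t}, with A = (C/2) e^{-i alpha}."""
    A = (C / 2) * cmath.exp(-1j * alpha)
    return A * cmath.exp(1j * omega * t) + A.conjugate() * cmath.exp(-1j * omega * t)

def rotating_vector(C, alpha, omega, t):
    """Second method: the complex number A e^{i w t}, with A = C e^{-i alpha}."""
    A = C * cmath.exp(-1j * alpha)
    return A * cmath.exp(1j * omega * t)

# Illustrative values (assumed).
C, alpha, omega = 2.0, 0.4, 3.0
for t in (0.0, 0.25, 1.0):
    expected = C * math.cos(omega * t - alpha)
    print(conjugate_pair(C, alpha, omega, t), rotating_vector(C, alpha, omega, t).real, expected)
```

The rotating vector keeps the constant length C while its projection on the real axis traces out the simple harmonic motion.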


1. Show directly that the solution $A\sin\omega t + B\cos\omega t$ for the particle
moving with simple harmonic motion can also be written $C\cos(\omega t - \alpha)$.
Find C and $\alpha$ as functions of A and B, and vice versa. The constant C is
called the amplitude of the motion, and $\alpha$ is called the phase. Note that $\alpha$
can be regarded as an angle, measured in radians.

2. A pendulum 1 m. long is held at an angle of 1 deg. to the vertical, 
and released with an initial velocity of 5 cm. per second toward the position 
of equilibrium. Find amplitude and phase of the resulting motion. 

3. A circuit contains resistance, inductance, and capacity, but there is no 
impressed e.m.f. Solve the differential equation in series, and show by 
comparison of the first few terms that the series represents the function
$e^{-(R/2L)t}(A\sin\omega t + B\cos\omega t)$, where $\omega^2 = 1/LC - R^2/4L^2$.

4. In an oscillatory circuit, show that the phases of the charge and the 
current differ by 90 deg. 

5. Given a complex number represented by a vector, what is the nature
of the vector representing its square root; its cube root? Find the three 
cube roots of unity, the four fourth roots, the five fifth roots, plotting them 
in the complex plane, and giving real and imaginary components of each. 
With one of the cube roots, in terms of its real and imaginary parts, cube by 
direct multiplication and show that the result is unity. 

6. Find real and imaginary parts of $\sqrt{3+5i}$, jT+lBi, $\sqrt{A+Bi}$, where
A, B are real.

7. Show that $\ln(-a) = \pi i + \ln a$, or $3\pi i + \ln a$, or in general $n\pi i + \ln a$,
where n is an odd integer.

8. Prove that if we have a complex solution of the problem of a vibrating
particle, the real part of this complex function is itself a solution of the
problem.

9. Show that in general a linear homogeneous differential equation of the 
nth order with constant coefficients has n independent exponential solutions 
of the sort we have considered. 

10. Show that if we have n independent solutions of an nth order differential
equation, then an arbitrary linear combination of these solutions, containing
n coefficients, is a general solution of the equation.




We have now reached the point where we can discuss a wide 
range of problems in oscillatory mechanical or electrical systems.
The general question we shall take up is that of a system containing
inertia, damping force proportional to the velocity, and
restoring force proportional to the displacement, under the
action of an impressed force. This leads to an inhomogeneous
second-order linear differential equation, of the form

$$m\frac{d^2x}{dt^2} + 2mk\frac{dx}{dt} + m\omega^2x = F(t), \qquad (1)$$

where the coefficients $2mk$ and $m\omega^2$ of the damping and restoring
force terms, respectively, are written in this way to obtain a
simple result. The term F(t), which makes the equation inhomogeneous,
is the impressed force, a function of time. The solution
of such an inhomogeneous equation, as we have seen, can be 
written as a sum of two parts. One is a particular solution 
of the problem, the so-called forced motion, a steady-state 
solution which persists as long as the force is applied. The other 
is the transient term, a general solution of the corresponding 
homogeneous equation obtained by setting F = 0. This 
transient proves to be a damped simple harmonic motion, an 
oscillation whose amplitude decreases exponentially with time, 
soon passing away, and leaving only the steady-state solution. 
The amplitude and phase of the transient are determined so 
that the whole motion will have the correct initial displacement 
and velocity, its two arbitrary constants being chosen to fit 
the initial conditions. 

18. Damped Vibrational Motion. — We first consider the 
transient motion, whose equation is obtained from (1) above 
by setting F = 0. In the preceding chapter we have seen that 
the solution can be written 

$$x = e^{-kt}\left(Ae^{\sqrt{k^2 - \omega^2}\,t} + Be^{-\sqrt{k^2 - \omega^2}\,t}\right). \qquad (2)$$

There are three cases: (1) $k^2 - \omega^2 < 0$; (2) $k^2 - \omega^2 = 0$; (3) $k^2 - \omega^2 > 0$. The first is the case where the damping is small. Here
$\sqrt{k^2 - \omega^2} = i\sqrt{\omega^2 - k^2}$, and the radical is real. Then we have
the same sort of expression we have considered before, and to
get a real answer we must write $B = \bar{A}$, or else we can take the
real part of a complex quantity. Let us do the latter: the solution
is the real part of $Ae^{-kt}e^{i\sqrt{\omega^2 - k^2}\,t}$, or is $Ce^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha)$. This is like a simple harmonic motion, of angular velocity
$\sqrt{\omega^2 - k^2}$, phase $\alpha$, but with an amplitude $Ce^{-kt}$ which continually
decreases with time, and it is called damped simple harmonic
motion. For small damping, the angular velocity can be
expanded in power series, and is $\omega - \frac{k^2}{2\omega} - \cdots$, differing from $\omega$
by a small quantity of the second order. Thus, for example,
a pendulum which is slightly damped will have its period only
very slightly altered by the damping. The amplitudes of
successive swings go down in exponential fashion, on account of the
factor $e^{-kt}$. Thus the logarithms of the amplitudes go down
linearly with the time, and as a result this kind of damping is 
known as logarithmic damping. The decrease in the logarithm of 
the amplitude in a period is known as the logarithmic decrement. 
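
A short numerical sketch may make the logarithmic decrement concrete. For the damped motion $Ce^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha)$ the amplitude falls by the factor $e^{-kT}$ in one period $T = 2\pi/\sqrt{\omega^2 - k^2}$, so that the decrement is kT. The Python below checks this, and also the small-damping expansion of the angular velocity; the function names and the values of C, k, and omega are assumptions for illustration.

```python
import math

def damped_x(t, C, k, omega, alpha=0.0):
    """Damped simple harmonic motion C e^{-kt} cos(sqrt(w^2 - k^2) t - alpha), for k < omega."""
    w1 = math.sqrt(omega**2 - k**2)
    return C * math.exp(-k * t) * math.cos(w1 * t - alpha)

def log_decrement(k, omega):
    """Decrease in the logarithm of the amplitude over one period, k T."""
    period = 2 * math.pi / math.sqrt(omega**2 - k**2)
    return k * period

# Illustrative values (assumed): light damping, k much smaller than omega.
C, k, omega = 1.0, 0.1, 2.0
period = 2 * math.pi / math.sqrt(omega**2 - k**2)

# Compare displacements one period apart (both at maxima of the cosine).
measured = math.log(damped_x(0.0, C, k, omega) / damped_x(period, C, k, omega))
print(measured, log_decrement(k, omega))

# Angular velocity: exact value and the first two terms of the power series.
exact_w = math.sqrt(omega**2 - k**2)
approx_w = omega - k**2 / (2 * omega)
print(exact_w, approx_w)
```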

The other extreme case is the third, where $k^2 - \omega^2 > 0$, and
there is nothing complex about the solution at all. It simply
consists of two exponential terms, with only real coefficients.
The resulting motion is not oscillatory, but merely damps down
gradually to zero. The limiting case, $k^2 - \omega^2 = 0$, is called
the critical case, and is most easily discussed as the limit of 
either of the others. An interesting practical application of 
all the cases is found in the problem of the vibrations of galvanom- 
eters. A galvanometer without damping oscillates back and 
forth with simple harmonic motion. With slight damping, it 
has nearly the same frequency, but a logarithmic decrement. 
As the damping is made greater and greater, the period gets 
larger and larger, until finally at critical damping and beyond 
there are no oscillations at all. The galvanometer, if displaced, 
simply settles slowly back to its normal position. 

19. Damped Electrical Oscillations. — The corresponding 
electrical problem is given by the circuit containing resistance, 
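inductance, and capacity, and the equation is

$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = 0. \qquad (3)$$

The solution is

$$q = Ce^{-(R/2L)t}\cos(\omega t - \alpha), \qquad (4)$$

where

$$\omega = \sqrt{1/LC - R^2/4L^2}.$$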

This is the same solution which we found in Prob. 3 of the last 
chapter by the series method. It is an interesting illustration 
of the simplicity of the exponential method of solving the equa- 
tion. As we see, the current oscillates with an angular velocity 
which, for small R, differs only slightly from the undamped 
angular velocity $\sqrt{1/LC}$, but it has a logarithmic damping,
which is greater the greater R is. 

20. Initial Conditions for Transients. — To fix the two arbitrary 
constants of the transient, we must fit the initial displacement 
and velocity. Thus, for instance, consider the solution in the form

$$x = Ce^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha).$$

Assume that at t = 0, $x = x_0$, and $dx/dt = v_0$. From the first,

$$x_0 = C\cos\alpha. \qquad (5)$$

To apply the second, we have

$$\frac{dx}{dt} = -Ce^{-kt}\sqrt{\omega^2 - k^2}\,\sin(\sqrt{\omega^2 - k^2}\,t - \alpha) - kCe^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha),$$

so that at t = 0

$$v_0 = C\sqrt{\omega^2 - k^2}\,\sin\alpha - kC\cos\alpha. \qquad (6)$$

By simultaneous solution of Eqs. (5) and (6) we can find C
and $\alpha$ in terms of $x_0$ and $v_0$.
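
The simultaneous solution can be sketched as follows. Eq. (5) gives C cos(alpha) directly, and substituting it into Eq. (6) gives C sin(alpha) = (v0 + k x0)/sqrt(omega^2 - k^2); the amplitude and phase then follow from these two components. The function name and the numerical values below are assumptions made for the example.

```python
import math

def fit_transient(x0, v0, k, omega):
    """Amplitude C and phase alpha of C e^{-kt} cos(w1 t - alpha) from x(0) = x0, x'(0) = v0."""
    w1 = math.sqrt(omega**2 - k**2)   # requires k < omega (small damping)
    c_cos = x0                        # Eq. (5): C cos(alpha) = x0
    c_sin = (v0 + k * x0) / w1        # Eq. (6) rearranged for C sin(alpha)
    return math.hypot(c_cos, c_sin), math.atan2(c_sin, c_cos)

# Illustrative initial conditions (assumed values).
x0, v0, k, omega = 0.02, -0.05, 0.1, 3.0
C, alpha = fit_transient(x0, v0, k, omega)
w1 = math.sqrt(omega**2 - k**2)

# Substitute back into Eqs. (5) and (6).
print(C * math.cos(alpha), x0)
print(C * w1 * math.sin(alpha) - k * C * math.cos(alpha), v0)
```

Using `atan2` rather than a bare arctangent keeps the phase in the correct quadrant when $x_0$ is negative.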

21. Forced Vibrations and Resonance. — Our next task is to 
find a particular solution of Eq. (1) containing the external 
applied force. To do this, we shall first solve the case where 
the force is a sinusoidal function of the time, a very important 
special case. This leads to a solution also sinusoidal with the
same frequency, with an amplitude proportional to the amplitude 
of the force, but for which the constant of proportionality depends 
on the frequency, becoming large out of all proportion if the 
impressed frequency is nearly equal to the natural frequency. 
This phenomenon of enormously exaggerated response of the 


oscillating system to a certain impressed frequency is called 
resonance; it is of great physical importance. 

Familiar examples of resonance will occur to one. In 
mechanics, it is well known that a pendulum can be set swing- 
ing with large oscillations if it receives small periodic impulses, 
timed to synchronize with its own period, whereas any other 
impressed frequency would soon get out of step with the oscil- 
lations it sets up, and would force them to die down again. 
Acoustical resonance is illustrated by the way in which one 
vibrating tuning fork will set another into vibration if both have 
the same pitch, but not otherwise. Another acoustical example 
comes from Helmholtz's resonators: air chambers vibrating 
with a definite pitch, which are set into resonant vibration if 
sound of that particular pitch falls on them, but not appreciably 
by any other pitch, so that they can be used to pick out a particu- 
lar note in a complicated sound and estimate its intensity. 
The resonance of electric circuits is illustrated in the tuned 
circuits of the radio, which respond only to sending stations 
of a particular wave length, and practically not at all to other 
stations. In optics, the theory of refractive index and absorp- 
tion coefficient is closely connected with resonance. As is 
shown by the sharp spectrum lines, atoms contain oscillators 
capable of damped simple harmonic motion, or at any rate act 
as if they did; the real theory, using wave mechanics, is com- 
plicated but leads essentially to this result. An external light 
wave is a sinusoidal impressed force, leading to a forced motion 
of the oscillators with the same frequency but different phase. 
The component of motion in phase with the field reacts back on 
the field to change its phase, and this progressive change of 
phase as the light travels through the body is interpreted as a 
changed velocity of propagation, or an index of refraction differ- 
ent from unity. Similarly the other component produces a 
diminution of intensity, or absorption. The phenomenon of 
anomalous dispersion, with abnormally large index of refraction 
and absorption coefficient, comes about when the external wave 
is in resonance with the atom. 

22. Mechanical Resonance. — Let the external force be $F_0\cos\omega t$.
It is simpler to regard this as being the real part of
$F_0e^{i\omega t}$. Thus we use the differential equation

$$m\frac{d^2x}{dt^2} + 2mk\frac{dx}{dt} + m\omega_0^2x = F_0e^{i\omega t}, \qquad (7)$$


where we use $\omega_0$ for the natural angular frequency, to distinguish
from the impressed angular velocity $\omega$. The resulting x will be
complex, and its real part represents the actual motion. Now we
assume that the forced motion has the same frequency as the
impressed force, or that $x = Ae^{i\omega t}$, where A may be complex.
If $A_r$ and $A_i$ are the real and imaginary parts of A, we easily see
that the real part of x is given by

$$A_r\cos\omega t - A_i\sin\omega t, \qquad (8)$$

so that in general the motion has one term in phase with the
force, whose amplitude is given by the real part of A, and another
out of phase, the amplitude being the negative of the imaginary
part. Substituting our exponential formula for x in Eq. (7), we have

$$[m(-\omega^2) + 2mk(i\omega) + m\omega_0^2]Ae^{i\omega t} = F_0e^{i\omega t}.$$

Canceling the exponential, we have

$$A = \frac{F_0}{m}\,\frac{1}{(\omega_0^2 - \omega^2) + 2ik\omega}. \qquad (9)$$

To get the coefficients of terms in phase and out of phase with
the force, or $A_r$ and $-A_i$, we multiply numerator and denominator
by the conjugate of the denominator, obtaining respectively

$$A_r = \frac{F_0}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2},$$

$$-A_i = \frac{F_0}{m}\,\frac{2k\omega}{(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2}. \qquad (10)$$

These two functions are plotted in Fig. 4. It is seen that the 
first has the form made familiar by the anomalous dispersion 
curve in optics, the second resembling the corresponding absorp- 
tion curve. This resemblance is an essential one, as we shall 
see in Chap. XXIV. One feature of the curves should be 
mentioned. The anomalous behavior in the neighborhood of
$\omega_0$ is confined to a narrower and narrower band of frequencies
as k becomes smaller and smaller compared with $\omega_0$, so that if
the damping is very small the resonance is very sharp, while if 
there is large damping, there is a broad range of frequencies over 
which resonance is appreciable. 
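
The behavior of the two components near resonance can be tabulated with a short Python sketch. It evaluates the complex amplitude A = (F0/m) / ((w0^2 - w^2) + 2ikw) by complex division and, separately, the in-phase and out-of-phase components obtained by multiplying through by the conjugate of the denominator. The function names and all numerical values are assumptions for illustration.

```python
def forced_amplitude(F0, m, k, omega0, omega):
    """Complex amplitude A of the forced motion x = A e^{i w t}."""
    return (F0 / m) / ((omega0**2 - omega**2) + 2j * k * omega)

def in_and_out_of_phase(F0, m, k, omega0, omega):
    """The pair (A_r, -A_i): components in phase and out of phase with the force."""
    denom = (omega0**2 - omega**2) ** 2 + 4 * k**2 * omega**2
    in_phase = (F0 / m) * (omega0**2 - omega**2) / denom
    out_of_phase = (F0 / m) * 2 * k * omega / denom
    return in_phase, out_of_phase

# Illustrative oscillator (assumed values): natural frequency 2, light damping.
F0, m, k, omega0 = 1.0, 1.0, 0.05, 2.0
for omega in (1.0, 1.9, 2.0, 2.1, 3.0):
    A = forced_amplitude(F0, m, k, omega0, omega)
    a_r, minus_a_i = in_and_out_of_phase(F0, m, k, omega0, omega)
    print(omega, A.real, -A.imag, a_r, minus_a_i)
```

At the natural frequency the in-phase component passes through zero while the out-of-phase component reaches F0/(2 m k w0), which grows without limit as k is made smaller.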

Fig. 4. — Amplitude of forced motion of an oscillator, as function of frequency.
(a) Component in phase with force; (b) component out of phase.

23. Electrical Resonance. — Suppose that a dynamo supplies
sinusoidally alternating electromotive force, given by $E\cos\omega t$,
to an electric circuit containing resistance, inductance, and
capacity. The differential equation for the charge is then

$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = E\cos\omega t. \qquad (11)$$

We set up instead the differential equation for the current $i = dq/dt$,
which we obtain from Eq. (11) by differentiating with
respect to time:

$$L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{i}{C} = \frac{d}{dt}(E\cos\omega t). \qquad (12)$$

As with the mechanical case, we replace $E\cos\omega t$ by the complex
exponential $Ee^{i\omega t}$, of which the real part gives the electromotive
force. Similarly we assume the current to be sinusoidal, given
by the real part of $i_0e^{i\omega t}$. Making these changes in Eq. (12),
and carrying out the differentiations, we have

$$\left(-L\omega^2 + i\omega R + \frac{1}{C}\right)i_0e^{i\omega t} = i\omega Ee^{i\omega t},$$

$$i_0 = \frac{E}{R + i(L\omega - 1/C\omega)}.$$

The denominator here equals $Ze^{i\alpha}$, where

$$Z = \sqrt{R^2 + X^2}, \qquad X = L\omega - \frac{1}{C\omega}, \qquad \alpha = \tan^{-1}\frac{X}{R},$$

where X is called the reactance, Z the impedance. Then the
current is

$$i = \frac{E}{Z}\cos(\omega t - \alpha). \qquad (15)$$

The impedance takes the place of the resistance in problems 
involving alternating currents, since we divide the amplitude of 
the e.m.f. by the impedance rather than by the resistance to get 
the amplitude of the current. We note that the impedance is a 
function of frequency. It becomes infinite when the frequency 
becomes zero, on account of the term involving the capacity, 
and showing that a direct current cannot go through a con- 
denser; and also when the frequency becomes infinite, on account 
of the term in the inductance, showing that infinitely rapid 
oscillations cannot pass through the inductance. In between, it 
goes through a minimum, at the frequency for which X = 0, 
or $\omega = 1/\sqrt{LC}$, the natural frequency at which the circuit
would oscillate by itself if it had no resistance or impressed 
e.m.f. Thus for impressed e.m.fs. of the same amplitude, but 
of a variety of frequencies, that whose frequency agrees most 
closely with the natural frequency will produce the largest cur- 
rent, and the others may produce much smaller currents, so that 
we have resonance, or tuning. To tune a circuit, one adjusts 
L or C, or both. When it is tuned, the sharpness of tuning 
depends on the size of R. For instance, if R were 0, there would 
be infinite response at exact resonance, so that the tuning would 
be infinitely sharp. 

In addition to the dependence of amplitude on frequency, 
there is also a phase difference between e.m.f. and current, given
by the quantity $\alpha$ above. We can get a simple interpretation
of this in the complex plane. The quantity $R + iX$ is called
the complex impedance. Its magnitude is just the real impedance
Z, and its phase, or angle, is the angle $\alpha$. It is interesting
to note that $\alpha$ goes from -90 deg. at zero frequency to +90 deg.
at infinite frequency, passing through zero at resonance. 
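
The dependence of impedance and phase on frequency can be seen in a short numerical sketch. The Python below evaluates the reactance X = L w - 1/(C w), the impedance Z = sqrt(R^2 + X^2), and the phase angle tan^{-1}(X/R) at frequencies far below, at, and far above resonance. The function name and the circuit values (2 ohms, 10 millihenries, 10 microfarads) are assumed purely for illustration.

```python
import math

def impedance(R, L, C, omega):
    """Return (Z, alpha): magnitude and angle of the complex impedance R + iX."""
    X = L * omega - 1.0 / (C * omega)   # reactance
    return math.hypot(R, X), math.atan2(X, R)

# Illustrative circuit (assumed values), in ohms, henries, farads.
R, L, C = 2.0, 0.01, 1e-5
omega_res = 1.0 / math.sqrt(L * C)      # frequency at which X = 0

for omega in (0.01 * omega_res, omega_res, 100 * omega_res):
    Z, alpha = impedance(R, L, C, omega)
    print(omega, Z, math.degrees(alpha))
```

At resonance the impedance falls to the bare resistance R, so that for a given e.m.f. the current is largest there; a smaller R makes this minimum deeper and the tuning sharper.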

24. Superposition of Transient and Forced Motion. — The 
general solution of an oscillatory problem is the sum of the steady- 
state motion (the particular solution), and a transient with 
arbitrary amplitude and phase, chosen to satisfy the initial con- 
ditions. Thus, choosing an electrical case, we may have no 
charge and current in a circuit at t = 0, but start applying a 


sinusoidal e.m.f. at that instant. The charge and current at
any later time are given by

$$q = Ae^{-(R/2L)t}\cos(\omega_0t - \alpha_0) + \frac{E}{\omega Z}\sin(\omega t - \alpha),$$

$$i = -A\omega_0e^{-(R/2L)t}\sin(\omega_0t - \alpha_0) - A\frac{R}{2L}e^{-(R/2L)t}\cos(\omega_0t - \alpha_0) + \frac{E}{Z}\cos(\omega t - \alpha),$$

where $\omega_0$ is the natural angular frequency, A and $\alpha_0$ the amplitude
and phase of the transient. Then to determine A and $\alpha_0$ we have
the equations

$$0 = q_0 = A\cos\alpha_0 - \frac{E}{\omega Z}\sin\alpha,$$

$$0 = i_0 = A\omega_0\sin\alpha_0 - A\frac{R}{2L}\cos\alpha_0 + \frac{E}{Z}\cos\alpha, \qquad (16)$$

where $q_0$, $i_0$ are initial charge and current, equal to zero for these
particular initial conditions.
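
Eqs. (16) are linear in A cos(alpha_0) and A sin(alpha_0), so they can be solved in two steps. The Python sketch below does this and then evaluates the charge and current, which should both vanish at t = 0. It assumes that the transient frequency is the damped value sqrt(1/LC - R^2/4L^2) found in Sec. 19; the function names and the circuit values are likewise assumptions made for the example.

```python
import math

def transient_constants(E, R, L, C, omega):
    """Solve Eqs. (16) for the transient amplitude A and phase alpha0 (zero initial q and i)."""
    X = L * omega - 1.0 / (C * omega)
    Z, alpha = math.hypot(R, X), math.atan2(X, R)
    # Assumption: frequency of the transient taken as the damped value of Sec. 19.
    omega0 = math.sqrt(1.0 / (L * C) - R**2 / (4 * L**2))
    a_cos = E / (omega * Z) * math.sin(alpha)                          # A cos(alpha0)
    a_sin = (R / (2 * L) * a_cos - E / Z * math.cos(alpha)) / omega0   # A sin(alpha0)
    return math.hypot(a_cos, a_sin), math.atan2(a_sin, a_cos), omega0, Z, alpha

def charge_and_current(t, E, R, L, C, omega):
    """Transient plus forced motion: charge q(t) and current i(t)."""
    A, alpha0, omega0, Z, alpha = transient_constants(E, R, L, C, omega)
    decay = math.exp(-R * t / (2 * L))
    q = (A * decay * math.cos(omega0 * t - alpha0)
         + E / (omega * Z) * math.sin(omega * t - alpha))
    i = (-A * omega0 * decay * math.sin(omega0 * t - alpha0)
         - A * R / (2 * L) * decay * math.cos(omega0 * t - alpha0)
         + E / Z * math.cos(omega * t - alpha))
    return q, i

# Illustrative values (assumed): 100 volts, 2 ohms, 10 mH, 10 microfarads.
E, R, L, C, omega = 100.0, 2.0, 0.01, 1e-5, 2000.0
print(charge_and_current(0.0, E, R, L, C, omega))
print(charge_and_current(0.002, E, R, L, C, omega))
```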

Three examples of the charge as a function of time are given 
in Fig. 5. In (a), the natural frequency is taken to be much 
greater than the external frequency, and the logarithmic decre- 
ment large, so that the transient is a rapidly damped high 
frequency vibration, which is imperceptible after a few periods 
of the external force. The case (b) is that in which external and
natural frequencies are almost equal, and the damping small. 
In this case, the forced and transient vibrations, having almost 
the same frequencies, form beats with each other, as one always 
has when two almost equal frequencies are superposed, the sum 
of two sine waves leading to a sinusoidal vibration whose fre- 
quency is the average of the two frequencies, but whose amplitude 
is modulated with the slow difference frequency between the two 
vibrations, as given by the equation 

$$\cos\omega_1t + \cos\omega_2t = 2\cos\left(\frac{\omega_1 + \omega_2}{2}\,t\right)\cos\left(\frac{\omega_1 - \omega_2}{2}\,t\right).$$

Since the transient gradually dies down, however, the amplitude 
of the beats grows less and less, until gradually only the forced 
motion remains. In the case (c), the external frequency is 
exactly equal to the natural frequency. Here there are no beats, 


the amplitude merely building up exponentially to its final
value.

Fig. 5. — Transient and forced motion superposed. Curve A is forced motion, B transient, C combined motion.
(a) Natural frequency high, impressed frequency low, large damping.
(b) Impressed and natural frequency approximately equal.
(c) Impressed and natural frequency equal.

25. Motion under General External Forces. — If we are given 
an arbitrary external force, say F(t), we shall show in a later 
chapter that it is possible to write it as a sum of sinusoidal terms : 

$$F(t) = \text{real part of } \sum_n F_ne^{i\omega_nt}.$$


Thus any sound may be considered as made up of a superposition 
of pure tones, and any light as a superposition of pure colors. 
Now suppose we find the forced motion resulting from each 
of these sinusoidal vibrations acting separately, and then add 
them. The result will be the solution of the whole problem. 
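For suppose $x_n(t)$ is the solution of the problem whose force is
the nth term of the summation, so that we have

$$\left(m\frac{d^2}{dt^2} + 2mk\frac{d}{dt} + m\omega_0^2\right)x_n = F_ne^{i\omega_nt}.$$

Add all these equations. Then we have

$$\left(m\frac{d^2}{dt^2} + 2mk\frac{d}{dt} + m\omega_0^2\right)\sum_n x_n = \sum_n F_ne^{i\omega_nt},$$

showing that $\sum_n x_n$ satisfies the whole equation. We readily
see that this is a special case of a general theorem: if the impressed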
force, in an inhomogeneous linear equation, is written as a sum 
of terms, and if we have solutions of the separate problems in 
which only one term of the sum is impressed at a time, the solu- 
tion of the whole problem is the sum of these separate solutions. 
We note that those particular forces whose frequencies are near 
the natural frequency will produce greatly exaggerated responses. 
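
As a rough numerical illustration of this superposition, we may take a damped oscillator written as M x'' + C x' + S x = F(t). This form and the constants M, C, S are assumptions made for the sketch, not the notation of the text. Each sinusoidal term is solved separately by its complex amplitude, and the sum is substituted back into the full equation.

```python
import cmath

M, C, S = 1.0, 0.2, 4.0   # hypothetical mass, damping, stiffness (natural frequency 2)

def amplitude(F, w):
    """Complex steady-state amplitude for a single force F*exp(i*w*t)."""
    return F / (S - M * w * w + 1j * C * w)

def residual(terms, t):
    """M*x'' + C*x' + S*x - F(t), with x the sum of the separate forced solutions."""
    total = 0j
    for F, w in terms:
        X = amplitude(F, w)
        e = cmath.exp(1j * w * t)
        x, v, a = X * e, 1j * w * X * e, -w * w * X * e
        total += M * a + C * v + S * x - F * e
    return total

# (F_n, omega_n): three forces of equal strength, one near the natural frequency.
terms = [(1.0, 0.5), (1.0, 1.9), (1.0, 5.0)]
worst = max(abs(residual(terms, 0.1 * n)) for n in range(100))
sizes = [abs(amplitude(F, w)) for F, w in terms]
print("largest residual of the summed solution:", worst)
print("response amplitudes:", [round(s, 3) for s in sizes])
```

The term whose frequency lies nearest the natural frequency gives much the largest response, as stated above.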
26. Generalizations Regarding Linear Differential Equations. 
We have made several generalizations regarding linear differential 
equations, and it is well to group these together. We have seen 

1. Any linear combination of solutions of a homogeneous linear 
differential equation is itself a solution, and if the linear combina- 
tion contains as many arbitrary constants as the order of the 
differential equation, it is a general solution. 

2. A general solution of an inhomogeneous linear differential 
equation is the sum of a particular solution, and a general solu- 
tion of the corresponding homogeneous equation. 

3. If the inhomogeneous part of an inhomogeneous linear 
differential equation is a sum of terms, and if we have the solu- 
tions of the equations formed by taking just one of these sepa- 
rately, the particular solution of the whole problem can be formed 
by adding these separate solutions. 

Physically, the first statement means that free vibrations of a 
system governed by a linear differential equation may be superposed
without affecting each other. The second means that
free vibrations can coexist with forced vibrations; and the last, 
that forced vibrations from different sources can coexist without 
affecting each other. All these properties of coexistence or 
superposability of vibrations are characteristic only of linear 
equations, but, as we shall see, a great many physical phenomena 
are governed by such equations, so that the superposability of 
vibrations is of widespread physical importance. 


1. A coil of resistance 2 ohms, inductance 10 millihenries, is connected to a 
condenser of capacity 10 mf. At t = 0, the condenser is charged to a 
potential of 100 volts, and no current is flowing. Find the charge on the 
condenser at any later time, and also the current flowing. What are the 
period and logarithmic decrement of the circuit? What would the resist- 
ance have to be, leaving inductance and capacity the same, such that the 
system would be critically damped? 

2. Prove that the displacement of a particle in damped oscillation is given by

$$x = e^{-kt}\left(x_0\cos\sqrt{\omega^2 - k^2}\,t + \frac{v_0 + kx_0}{\sqrt{\omega^2 - k^2}}\sin\sqrt{\omega^2 - k^2}\,t\right),$$

where $x_0$, $v_0$ are initial values of displacement and velocity. Pass to the
case of critical damping, by letting $\omega^2 - k^2$ approach zero. Show that the
resulting motion has one term of the form $te^{-kt}$, and prove directly that this
satisfies the differential equation.

3. Letting $\omega = k/2$, draw curves for x as a function of t, representing the
damped motion for the case where the initial velocity is zero but the initial 
displacement is not, and also for the case where the initial displacement is 
zero but the velocity is not. 

4. A pendulum is damped so that its amplitude falls to half its value in 
1 min. Its actual period is 2 sec. Find the change in the period which 
there would be if the damping were not present. (Hint: use power series 
expansion for frequency, treating k as a small quantity.)

5. A radio receiving station has a circuit tuned to a wave length of 500 m. 
It is desired to have the tuning sharp enough so that a frequency differing 
from this by 10,000 cycles per second gives only 1 per cent as much response 
as the natural frequency, for the same amplitude of signal. Work out 
reasonable values of resistance, inductance, and capacity to accomplish this. 

6. The sharpness of tuning of a vibrating system may be measured by 
the so-called half breadth of the resonance band, or the frequency difference 
between the two frequencies for which the amplitude of response is half 
that at exact resonance. Prove that the ratio of half breadth to resonance 
frequency is proportional to the logarithmic decrement, if the damping is 
not too great. 

7. A tuning fork of pitch C (256 vibrations per second) is so slightly 
damped that its amplitude after 10 sec. is 10 per cent of the original amplitude.
It is set into oscillation, first by another fork of the same pitch, then
by one a semitone higher, both vibrating with the same amplitude. Find
the ratio of amplitudes of forced motion in the two cases. What will be 
the pitch of the forced vibration in the second case? 

8. The support of a simple pendulum moves horizontally back and forth 
with simple harmonic motion. Show that this sets the pendulum into forced 
motion, as if there were a force applied directly to the bob. Show that the 
motion has the following behavior: The pendulum pivots about a point 
not its point of support, but such that, if it were really pivoted here, its 
natural period would be the actual period of the forced motion. Discuss 
the cases where the pivotal point is below the point of support; above the 
point of support. Neglect transients. 

9. A particle subject to a linear restoring force and a viscous damping is 
acted on by a periodic force whose frequency differs from the natural fre- 
quency by a small quantity. The particle starts from rest at t = 0, and 
builds up the motion. Discuss the whole problem, including initial condi- 
tions. Consider what happens in the limiting case when the frequency 
gets nearer and nearer the natural frequency, and the damping gets smaller 
and smaller. Show that the results are as indicated in Fig. 5, (b), (c).

10. The amplitude of the forced current in a circuit is 

$$i_0 = \frac{E}{R + i(L\omega - 1/C\omega)}.$$

Plot real part as abscissa, imaginary part as ordinate, obtaining a curve by 
taking points for all frequencies. Find the equation of the resulting curve, 
and prove that it is a circle. 

11. Show that for a particle subject to a linear restoring force and viscous 
damping the maximum amplitude occurs when the applied frequency is 
less than the natural frequency. Find this resonance frequency. Show 
that maximum energy is attained when the applied frequency equals the 
natural frequency. What are the maximum amplitude and maximum energy?

12. The motion of an anharmonic undamped oscillator is described by

$$m\frac{d^2x}{dt^2} + m\omega_0^2 x + bx^2 = 0,$$

where b is a small quantity. Solve this equation by successive approximations,
expanding x in a power series in powers of b.

13. If the oscillator in Problem 12 is acted on by a force A cos pt + B cos
qt, show that the steady-state solution contains terms of frequencies 2p,
2q, q + p, q - p, 2q + p, 2q - p, etc. Note that superposition does not
hold for the equation above. These new frequencies are called combination tones.


We have progressed far enough in our study of mechanics so 
that it will pay to stop and survey the situation. Mechanics is 
a large subject, and we may consider some of the directions in
which we could extend what we have done already. In the first 
place, we may treat the mechanics of many sorts of systems. We 
may have the mechanics of particles, or of rigid bodies, or of 
deformable, elastic solids, or of fluid media. All these we shall 
treat, in more or less detail, before we are through. What we 
have done so far comes under the heading of mechanics of parti- 
cles, and we shall look at that field in more detail. 

In the first place, one almost never has real particles to deal 
with in a mechanical problem. Probably the closest approach 
is found in the kinetic theory of monatomic gases, where the 
atoms act like movable points exerting forces on each other. 
But often very large bodies can act as particles, as, for instance, 
the planets in their motions about the sun. Then again we can 
have essentially complicated systems, like pendulums, or weights 
suspended on springs, which yet have such simple motions that 
we can apply the methods of the mechanics of particles to them. 
Many of the problems we have treated so far have been of this sort.

A particle has three coordinates, which may be x, y, z, and the 
problem of mechanics is to find the way in which these coordinates 
change with time. The starting point is Newton's second law of 
motion, giving the accelerations, or second time derivatives of 
the coordinates, in terms of the forces. All of our problems so far,
whether dealing with actual particles or not, fall under this 
classification, and in fact belong to the more restricted class of 
one-dimensional problems, with but one coordinate x. The next 
few chapters will be devoted to the two- and three-dimensional 
cases of mechanics of a particle. 

The one-dimensional motions of a particle fall into different 
classes, depending on the type of force acting. We have treated 
several sorts of forces: viscous resistances, linear restoring forces,
external forces which are arbitrary functions of time. That
is, the force may be a function of velocity, of position, or of time, 
or, of course, of all three combined. Most common mechanical 
problems are of this type, the force depending on v, x, and t, but 
this is not necessary. For instance, in radiation problems, in 
electromagnetic theory, one meets a force proportional to the 
time derivative of acceleration, or to $d^3x/dt^3$, which turns out to
act much like a viscous resistance. But such cases are rare. 

The simplest cases are those in which the force depends only 
on the coordinate. Then, in one-dimensional motion, we can 
always introduce a potential energy, which added to the kinetic 
energy gives a total energy that stays constant, expressing the 
conservation of energy. If, on the other hand, there are external 
impressed forces, the energy may increase or decrease with time, 
depending on whether the impressed forces do work on the system 
or have work done on them; while, if there are frictional forces, 
the energy will decrease with time, being dissipated in heat, for 
which reason these forces are called dissipative forces. It is 
plain that the study of different types of forces is closely tied up 
with the idea of energy, which we so far have not discussed, and 
we turn to this question, first deriving the mathematical formula- 
tion of kinetic energy for one-dimensional problems. 

27. Mechanical Energy. — Let us see where the concept of 
energy comes from, and how we can use it. We start with a 
particle of mass m, acted on by a force F. Then Newton's
second law is $m\,d^2x/dt^2 = F$. Now let us multiply each side by
dx/dt, and integrate with respect to t, from time $t_0$ up to t:

$$m\int_{t_0}^{t}\frac{dx}{dt}\frac{d^2x}{dt^2}\,dt = \int_{t_0}^{t}F\frac{dx}{dt}\,dt.$$

Both these integrals can be transformed. First, we note that

$$\frac{d}{dt}\left(\frac{dx}{dt}\right)^2 = 2\frac{dx}{dt}\frac{d^2x}{dt^2}.$$

Thus the left side is

$$\frac{m}{2}\int_{t_0}^{t}\frac{d}{dt}\left(\frac{dx}{dt}\right)^2 dt = \frac{m}{2}\left(\frac{dx}{dt}\right)^2\Bigg|_{t_0}^{t},$$

or letting dx/dt be denoted by v, and its value at $t = t_0$ by $v_0$, this
side is $mv^2/2 - mv_0^2/2$. On the right, $\int F\,(dx/dt)\,dt = \int F\,dx$, where
now the integral is from $x_0$ to x, if $x_0$ is the value of x at $t = t_0$,
x at t. Then the equation is

$$\tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2 = \int_{x_0}^{x}F\,dx. \qquad (1)$$

The quantity $mv^2/2$ is called the kinetic energy, $\int F\,dx$ is the work
done, and our equation says that the work done by the force on 
the particle between two instants of time equals the increase in 
kinetic energy during the time. This is the fundamental propo- 
sition relating to energy, and our proof is the standard one. 
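
The statement that the work done equals the increase in kinetic energy holds for any force, and this can be checked by a rough numerical integration. In the sketch below the force law, the mass, and the starting values are hypothetical choices made for illustration, and the midpoint rule is simply one convenient stepping scheme.

```python
import math

def force(x, v, t):
    # Hypothetical force: linear restoring force + viscous drag + impressed force.
    return -4.0 * x - 0.3 * v + 1.5 * math.cos(2.0 * t)

def work_and_kinetic_change(m=2.0, x=1.0, v=0.0, t_end=5.0, dt=1e-4):
    """Integrate m*dv/dt = F by the midpoint rule, accumulating the work F*dx."""
    v_start, t, work = v, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        a_start = force(x, v, t) / m
        x_mid = x + 0.5 * dt * v
        v_mid = v + 0.5 * dt * a_start
        f_mid = force(x_mid, v_mid, t + 0.5 * dt)
        work += f_mid * v_mid * dt      # F dx, with dx = v dt over this step
        x += v_mid * dt
        v += f_mid / m * dt
        t += dt
    return work, 0.5 * m * v * v - 0.5 * m * v_start * v_start

work, delta_ke = work_and_kinetic_change()
print(f"work done = {work:.6f}, increase in kinetic energy = {delta_ke:.6f}")
```

The two numbers agree to within the discretization error of the step, whatever force law is substituted.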

Next we consider the nature of the force F. First there is 
the case where it depends only on the position of the particle, 
as in a gravitational field or a linear restoring force, without 
friction. Then F = F(x), and we may write $\int^{x} F(x)\,dx = -V(x)$,
so that $mv^2/2 + V(x) = mv_0^2/2 + V(x_0)$. The quantity
V(x) is called the potential energy, and the sum of it and the kinetic
energy is the total energy; our equation states that the total
energy remains constant during the motion. The lower limit
of integration may be chosen in an arbitrary way, or an arbitrary
constant of integration may be added to the potential
energy, without changing the results, which depend only on
potential differences. The potential energy is related to the
force either by the equation above, or by its derivative, $F = -dV/dx$.

In case the force depends on the velocity as well as the position, 
the situation is quite different. Then the value of F cannot be 
predicted when x is known, so that we cannot even evaluate the 
work done without knowing more details about the system. In 
such a case it is plainly impossible to set up a potential energy 
function independent of time, or to speak of the total energy 
being conserved. Such a system is called nonconservative, in 
contrast to a conservative system in which the energy stays 
constant. Even in a nonconservative system it is often possible 
to write a potential function connected with part of the force. 
Thus with a damped oscillator, we can write a potential function 
for the restoring force, but not for the viscous resistance. In 
such a case we shall still speak of the sum of the kinetic and 
potential energy as being the total energy, but we can no longer 
say that it remains constant. Rather we should say that the 
time rate of change of the energy was equal to the rate of working
of outside forces, both of viscosity and of any external impressed
forces, on the system. Let us see what this means mathe- 
matically. Let F = -dV/dx + G, where V is the potential 
function for that part of the force derivable from a potential, 
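and G is the remaining force. Then the energy is $mv^2/2 + V$.
The time rate of change of the energy is

$$\frac{d}{dt}\left(\frac{mv^2}{2}\right) + \frac{dV}{dt} = v\left(m\frac{dv}{dt} + \frac{dV}{dx}\right),$$

using $\frac{dV}{dt} = \frac{dV}{dx}v$. But $m\frac{dv}{dt} + \frac{dV}{dx} = G$, by Newton's
second law, so that the time derivative of energy reduces to Gv,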
or the external force times the velocity, as we should expect. 
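
A quick check of this result can be made on the freely decaying damped oscillator. The sketch assumes the equation of motion x'' + 2K x' + W0^2 x = 0 per unit mass, so that the nonpotential force is G = -2MKv. The constants M, K, W0 are hypothetical values, and the time derivative of the energy is taken by a centered finite difference.

```python
import math

M, K, W0 = 1.0, 0.1, 2.0          # hypothetical mass, damping constant, natural frequency
W1 = math.sqrt(W0**2 - K**2)      # frequency of the damped oscillation

def x(t):
    return math.exp(-K * t) * math.cos(W1 * t)

def v(t):
    return math.exp(-K * t) * (-K * math.cos(W1 * t) - W1 * math.sin(W1 * t))

def energy(t):
    """Kinetic energy plus the potential energy of the restoring force."""
    return 0.5 * M * v(t)**2 + 0.5 * M * W0**2 * x(t)**2

def rate_of_working(t):
    G = -2.0 * M * K * v(t)       # viscous force, the part not derivable from V
    return G * v(t)

h = 1e-5
for t in (0.3, 1.0, 2.7):
    dEdt = (energy(t + h) - energy(t - h)) / (2 * h)
    print(f"t = {t}: dE/dt = {dEdt:.6f}, G*v = {rate_of_working(t):.6f}")
```

Since Gv is never positive here, the energy can only decrease, in agreement with the dissipative character of the viscous force.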

One should not be disturbed to find systems whose total energy 
does not stay constant. At first sight they seem to contradict 
the general law of conservation of energy, but on closer examina- 
tion we always find that they are parts of a larger system whose 
energy really is conserved. Thus if we consider not merely the 
damped vibrating particle but also the viscous fluid doing the 
damping, we shall find that the latter gains the energy lost by 
the former, transforming it into heat, itself a form of energy. 
It is in fact a general situation that there are two ways of treating 
a mechanical problem: first, by considering the whole system, 
and treating it as a conservative system; second, treating only 
part of the system, and taking the forces exerted by the rest on 
this part as being impressed or dissipative forces, which cannot 
be derived from a potential. 

28. Use of the Potential for Discussing the Motion of a System. 
The one-dimensional motion of a particle in a conservative field 
can be discussed with great ease by the use of the potential 
function. Suppose we know V as a function of x, and suppose 
that we inquire about the motion of a particle of total energy E 
in this potential field. Then we have $mv^2/2 = E - V$,
$v = \sqrt{2(E - V)/m}$. Since this is a known function of x, we can
find the speed at every point. In the first place, we can use this 
to get an explicit solution of the problem. For writing v = dx/dt, 
and integrating, we have 


$$t - t_0 = \int\frac{dx}{\sqrt{2(E - V)/m}}, \qquad (2)$$

giving a relation between t and x, involving two arbitrary constants
(energy, and the constant of integration, determining the
origin of time, or the phase). Thus for instance for a particle 
moving under gravity, where V = mgx, we have

$$t - t_0 = \int\frac{dx}{\sqrt{2E/m - 2gx}}.$$

Letting z = 2E/m - 2gx, so that dz = -2g dx, this is

$$t - t_0 = -\frac{1}{2g}\int\frac{dz}{\sqrt{z}} = -\frac{\sqrt{z}}{g} = -\frac{1}{g}\sqrt{\frac{2E}{m} - 2gx},$$

where evidently $t_0$ is the value of t at which 2E/m - 2gx = 0,
or x = E/mg, which, as we readily see, is the highest point of
the path, at which the body commences to fall. If we let this
value of x be $x_0$, then, squaring, we have $x - x_0 = -\tfrac{1}{2}g(t - t_0)^2$,
the familiar solution. Many one-dimensional problems can be 
solved by this method, as for instance the pendulum with large 
amplitude, which leads to an elliptic integral. On the other 
hand, there are, of course, many cases where the integration is 
too difficult to carry out. 
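
When the integration cannot be done in closed form, the energy integral can still be evaluated numerically. The sketch below is one possible approach, not a prescribed method: the function name half_period is hypothetical, the turning points a and b are supplied by the user, and the substitution x = mid + half sin(theta) is used to remove the infinite integrand at the turning points.

```python
import math

def half_period(V, E, m, a, b, n=2000):
    """Time to travel between turning points a < b, from
    t = integral of dx / sqrt(2(E - V)/m).

    With x = mid + half*sin(theta) the integrand stays finite at simple
    turning points; a midpoint rule is then applied in theta.
    """
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * dtheta
        x = mid + half * math.sin(theta)
        speed = math.sqrt(2.0 * (E - V(x)) / m)
        total += half * math.cos(theta) / speed * dtheta
    return total

# Harmonic oscillator, V = k x^2 / 2 with k = 4, m = 1, E = 2 (turning points -1, +1).
T_harmonic = 2.0 * half_period(lambda x: 2.0 * x * x, 2.0, 1.0, -1.0, 1.0)

# Pendulum in units where g/L = 1 and m L^2 = 1, swinging to 2 radians each side.
amp = 2.0
V_pend = lambda th: -math.cos(th)
T_pendulum = 2.0 * half_period(V_pend, V_pend(amp), 1.0, -amp, amp)

print("harmonic period:", T_harmonic, " expected 2*pi/omega =", math.pi)
print("pendulum period at 2 rad amplitude:", T_pendulum,
      " small-amplitude value:", 2.0 * math.pi)
```

For the harmonic case the result reproduces the known period; for the pendulum the period at large amplitude comes out longer than the small-amplitude value.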

Even if the solutions cannot be obtained exactly, however, 
we can still use the method of the energy to get general information
about the problem. Let us imagine V plotted as a function
of x (see Fig. 6). Then we draw on the same graph a horizontal
line at height E. The square root of the difference between the 
two curves is then proportional to the velocity of the particle 
at that point. Thus the velocity is only real where this difference 
is positive, and is imaginary elsewhere. If the velocity is only 
real in certain regions of x, this means that the motion can only 
occur within those regions. As the particle approaches the edge
of such a region, the speed gets smaller and smaller, and finally 
at the edge the particle stops. Then it reverses and travels 
away again. The possibility of going either toward or away
from the boundary comes from the two signs of the square
root: the velocity at a given point of space is always the same in
magnitude, but can be in either direction. If now the region 
where the kinetic energy is positive is bounded at both ends, 
then after reversing its motion at one edge, the particle will travel 
to the other, reverse, come back, and repeat the process indefinitely.
Since at a given point the particle always travels with
the same speed, it will always require the same time to traverse
its path, and the motion will be periodic. Thus, if the total
energy is $E_1$ (Fig. 6), the motion is periodic, confined between
c and e. If it is $E_2$, either of two periodic motions is possible,
between b and f, or between h and j. This is a general result
for a conservative motion in one dimension which does not 
extend to infinity. 

Fig. 6. Potential energy V as function of coordinate x.
Total energy $E_1$, periodic motion between c and e.
Total energy $E_2$, periodic motion between b and f, or h and j.
Total energy $E_3$, nonperiodic motion between a and infinity, reversing at a.
Total energy $E_4$, nonperiodic nonreversing motion.

If, on the other hand, the kinetic energy remains positive in
one direction all the way to infinity, but becomes negative at a
finite point in the other direction, the particle will come in from
infinity, having taken since negatively infinite time to do it, will
reverse, and will return to infinity. This is the case with energy
$E_3$, the particle coming in from the right, speeding up in the
region about i, slowing down about g, speeding up about d,
finally coming to rest at a, and reversing, traveling back to the
right. An example of the first, periodic case is a particle vibrat- 
ing in simple harmonic motion, and of the second nonperiodic 
case is a ball coming from infinity, hitting a wall, and being 
bounced back again, or a ball thrown up in the air and coming 
down again. Finally there is the possibility of a potential such 
that the kinetic energy is positive at all values of x. Then the 
particle persists in one direction forever, like a free particle, but 
generally travels with a variable velocity. Such a case is found 
for energy $E_4$, where the particle starts from infinite distance in
one direction, travels toward the center, speeds up and slows
down corresponding to the maxima and minima, and finally goes
to infinity in the other direction. It is to be noted that motions 
in the same potential field, but with different total energy, can 
have quite different characteristics under this classification. 
Thus oscillatory motions are always possible around minima in 
the potential energy, for small enough total energy. But it 
may be that for too high total energy the particle will be able to 
get entirely away from the neighborhood of the minimum, and 
will go to infinity. In Fig. 6, there are three points, d, g, and i, 
at which the force is zero, and the particle would stay at rest 
forever, if it were placed at one of these points. Of these, g 
is a position of unstable equilibrium, and a small impact would 
start the particle oscillating about either d or i. On the other
hand, d and i are both points of stable equilibrium, so that a 
particle at rest at either of these points would suffer only small 
oscillations about that point if struck a small impact. 
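
The graphical construction can be imitated on a grid of points: wherever E - V is positive the motion is allowed, and the ends of each allowed interval are the turning points. The sketch below uses a hypothetical double-well potential, not the curve of Fig. 6, and a hypothetical helper allowed_regions.

```python
def allowed_regions(V, E, x_min, x_max, n=4000):
    """Return the intervals [(left, right), ...] on a grid where E - V(x) >= 0.

    An interval that reaches x_min or x_max is cut off by the grid,
    so it may really extend farther.
    """
    dx = (x_max - x_min) / n
    regions, start = [], None
    for i in range(n + 1):
        x = x_min + i * dx
        inside = E - V(x) >= 0.0
        if inside and start is None:
            start = x
        elif not inside and start is not None:
            regions.append((start, x - dx))
            start = None
    if start is not None:
        regions.append((start, x_max))
    return regions

def V(x):
    # Hypothetical double-well potential: minima at x = -1 and +1, maximum at x = 0.
    return x**4 - 2.0 * x**2

low = allowed_regions(V, -0.5, -2.0, 2.0)   # energy below the central maximum
high = allowed_regions(V, 0.5, -2.0, 2.0)   # energy above the central maximum
print("E = -0.5:", [(round(a, 3), round(b, 3)) for a, b in low])
print("E = +0.5:", [(round(a, 3), round(b, 3)) for a, b in high])
```

Below the central maximum there are two separate periodic motions, one about each minimum; above it there is a single periodic motion embracing both, which is the same change of character with energy as described for Fig. 6.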

29. The Rolling-ball Analogy. — A simple model which shows 
the properties of one-dimensional motion can be set up as 
follows. We imagine a track, like a roller coaster, set up, shaped 
just like the potential curve. Then we start a ball rolling on 
this track, starting from rest at a given height. Its motion will 
then approximate that of a particle in the corresponding potential 
field. The reason is that, since gravitational potential is proportional
to height, the ball actually has the potential at any
point which it should, and correspondingly the correct speed. 
The only approximations made, other than friction, consist in 
neglecting the fact that part of the kinetic energy actually goes 
into up and down motion, and part into rotation, instead of all 
into horizontal motion. From such a model, we can see how 
motion may be oscillatory, if the track rises on the far side of a 
dip up to the height where the ball started, or how it can go to 
infinity if the track continues permanently at a lower level. 
We can also see the general character of the solution in the case 
where there is damping, just by imagining that the ball is subject 
to friction. Obviously the motion still will have the character 
of the undamped motion, but corresponding to a continually 
decreasing energy. Thus with an oscillatory motion the ampli- 
tude will constantly decrease until it stops, while with a motion 
which originally was not oscillatory it may be possible for it to
become trapped in a minimum of potential, settle down to oscillate,
and eventually come to rest. In any case, if the damping
continues, the motion will eventually stop, at a minimum of potential.
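
The behavior with damping can be seen in a small simulation. This is only a sketch under stated assumptions: the double-well potential V = x^4 - 2x^2, the viscous constant c, and the simple stepping scheme are all hypothetical choices, and the rotation and vertical motion of a real ball are ignored.

```python
def settle(x=2.0, v=0.0, m=1.0, c=0.5, dt=1e-3, t_end=60.0):
    """Semi-implicit Euler integration of m*x'' = -dV/dx - c*v
    for the hypothetical potential V = x**4 - 2*x**2."""
    def energy(x, v):
        return 0.5 * m * v * v + x**4 - 2.0 * x**2
    e_start = energy(x, v)
    for _ in range(int(round(t_end / dt))):
        force = -(4.0 * x**3 - 4.0 * x) - c * v
        v += force / m * dt
        x += v * dt
    return x, v, e_start, energy(x, v)

x_end, v_end, e_start, e_end = settle()
print(f"final position {x_end:.4f}, final velocity {v_end:.2e}")
print(f"energy fell from {e_start:.3f} to {e_end:.3f}")
```

The particle starts with enough energy to pass over the central maximum, loses energy until it is trapped in one of the two wells, and comes to rest at the bottom of that well, where V = -1.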

30. Motion in Several Dimensions. — So far, we have treated 
only the motion of a particle in one dimension. If it can move 
about in two- or three-dimensional space, however, the problem 
becomes much more difficult. Suppose the coordinates of a 
particle are x, y, z, so that its motion is described by finding x, y, 
and z as functions of time. Force and acceleration are vectors, and
our first task is to investigate vector analysis enough to deal 
with these quantities. We shall find that in two and three 
dimensions it is by no means true that all force fields, in which 
the force is a function of the position alone, can be derived from 
a potential function. The next chapters, then, will deal with 
vectors, force fields, and potentials. When we come to the 
equations of motion, we find separate equations for each component:
if $F_x$, $F_y$, $F_z$ represent the components of force along
the axes, we have

$$m\frac{d^2x}{dt^2} = F_x, \quad m\frac{d^2y}{dt^2} = F_y, \quad m\frac{d^2z}{dt^2} = F_z, \qquad (3)$$

a set of simultaneous differential equations. Such equations 
can be solved in a few simple cases. For instance, if $F_x$ depends
only on x, $F_y$ only on y, $F_z$ only on z, they are simply three
independent equations, which we can handle by the methods 
already used. This is called the method of separation of vari- 
ables, and much of our effort will be directed toward this method 
of solution. We shall carry out methods of changing to arbitrary 
coordinate systems, with a view to separating variables. For 
instance, in motion under a force acting toward a center of attrac- 
tion, we introduce polar coordinates, and in these the equation 
for r is separated from those for the angles, so that we can solve. 
The process of changing coordinate systems leads us to Lagrange's 
equations, the equations of motion in generalized coordinates. 
Finally, the method of energy, the rolling-ball analogy, and the 
other methods of the present chapter, can be used in several 
dimensions, and provide the best means for a qualitative discus- 
sion of a problem. 


1. Take the sinusoidal solution for the displacement of a harmonic oscilla- 
tor, find the velocity from it, compute kinetic and potential energy as func- 


tions of time, and add to show that the sum remains constant. Show that 
the energy is proportional to the square of the amplitude. 

2. Proceed as in Prob. 1, but for the damped oscillator, finding the sum of
kinetic and potential energies, showing that it decreases with time. Com- 
pute the time rate of change of the energy, find the rate of working of the 
frictional force, and show by direct comparison that they are equal. 

3. Let a particle move in a field whose potential is $-1/x + 1/x^2$. Show
by graphical methods that for small total energy the motion is oscillatory, 
but that for larger energy it is nonperiodic and extends to infinity. Find 
the energy which forms the dividing line between these two cases. Compute 
the limiting frequency of the oscillatory motion as the amplitude gets 
smaller and smaller (using the results of Prob. 1, Chap. I), and describe 
qualitatively how the frequency changes when the amplitude increases. 

4. Solve directly the problem of the motion of a particle moving in a 
field of potential $-1/x + 1/x^2$, using the energy integral. Show that the
mathematical solution has the physical properties found in Prob. 3. 

5. Using the solution of Prob. 4, find the period of oscillation of the oscillatory
solutions in the potential $-1/x + 1/x^2$, as functions of the energy.
To do this, note that the two ends of the path are the values of x for which
$\sqrt{2(E - V)/m} = 0$. Thus the integral $\int_{x_0}^{x_1} dx/\sqrt{2(E - V)/m}$, from one of
these points to the other will give just the half period. Show that the
period approaches the value found in Prob. 3 for small oscillations. 

6. In an electric circuit, show that one can set up a magnetic energy
$\tfrac{1}{2}Li^2$ analogous to a kinetic energy, and an electric energy $\tfrac{1}{2}q^2/C$ analogous
to a potential energy. Show that the rate of change of this total energy
equals the rate of working of the resistance and the applied electromotive force.

7. An atom acts like a particle held to a position of equilibrium by a 
definite restoring force and a viscous resistance. An external light wave 
exerts a sinusoidal force, the atom executing a forced vibration under the 
influence of the wave. Show that the atom continually absorbs energy from 
the wave, the energy going into the viscous resistance. Show that the rate 
of absorption is proportional to the component of amplitude out of phase 
with the force, which we have already connected with the absorption 

8. Solve the problem of the undamped oscillator, by using the equation 
$t = \int dx/\sqrt{2(E - V)/m}$.

9. Discuss the problem of the pendulum with arbitrary amplitude by the 
graphical method. Show that for low energies the motion is oscillatory, 
but for high energies it is a continuous rotation. Sketch the qualitative 
form of curves for angular displacement as a function of time, for several 
energies, in both the oscillatory and rotatory ranges. 

10. Set up the problem of the pendulum by the method of Prob. 8, and 
show that t as a function of the angle is given by an elliptic integral. (Hint: 
Use the information about elliptic integrals given in Peirce's table; note 
that $1 - \cos\theta = 2\sin^2\tfrac{1}{2}\theta$.)



In our one-dimensional problems, we have had no occasion 
to mention vectors; however, before we can treat the detailed 
theory of motion in two or three dimensions, we must discuss 
them, and their relation to such things as potential energy. 

31. Vectors and Their Components. — The force, in two- or 
three-dimensional motion, is a vector, and we must make a 
study of the mathematical relations of vectors. In the first 
place, a vector is often denoted by its components along three
axes at right angles, as $F_x$, $F_y$, $F_z$. Vectors, in the second place,
obey the following law of addition: if two vectors F and G have
components $F_x$, $F_y$, $F_z$ and $G_x$, $G_y$, $G_z$, respectively, the components
of the sum F + G are $(F_x + G_x)$, $(F_y + G_y)$, $(F_z + G_z)$. A
graphical discussion shows that this is equivalent to the familiar
parallelogram law of addition (as in Fig. 2, where the same
proposition was shown for complex numbers, regarded as vectors).
Third, if we multiply a vector by a constant, as C, each component
is multiplied by this constant. Thus the components of
CF are $CF_x$, $CF_y$, $CF_z$. Often a constant like C is called a
"scalar," to distinguish it from a vector. A scalar is a quantity 
which has magnitude but not direction, a vector having both 
magnitude and direction. 

It is often useful to write vectors in terms of three so-called 
unit vectors, i, j, k. Here, i is a vector of unit length, pointing 
along the x axis, and similarly j has unit length and points along 
the y axis, and k along the z axis. Now we can build up a vector 
F out of them, by forming the quantity $\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z$. This
is the sum of three vectors, one along each of the three axes, and
the first, which is just the component of the whole vector along
the x axis, is $F_x$, and the other components likewise are $F_y$ and
$F_z$. Thus the final vector has the components $F_x$, $F_y$, $F_z$, and
is just the vector F.

By the magnitude of a vector we mean its length. By the 
three-dimensional analogy to the Pythagorean theorem, by which 
the square on the diagonal of a rectangular prism is the sum
of the squares on the three sides, the magnitude of a vector F
equals $\sqrt{F_x^2 + F_y^2 + F_z^2}$. We often speak of unit vectors,
i.e., vectors whose magnitude is 1. 

The component of a vector in a given direction is simply the 
projection of the vector along a line in that direction. It evi- 
dently equals the magnitude of the vector, times the cosine of 
the angle between the direction of the vector and the desired 
direction. As a special example, the component of a vector F 
along the x axis is $F_x$, and this must equal the magnitude of F,
times the cosine of the angle between F and x. If this angle is
called (F, x), then we must have

$$\cos(F, x) = \frac{F_x}{\sqrt{F_x^2 + F_y^2 + F_z^2}},$$

with similar formulas for y and z components. The three cosines
of the angles between a given direction, as the direction of the
vector F, and the three axes, are called direction cosines, and are
often denoted by letters l, m, n, so that in this case we have
l = cos (F, x), etc. It follows immediately that $l^2 + m^2 + n^2 = 1$.
We can make a simple interpretation of the direction cosines of
any direction: they are the components of a unit vector in the
desired direction, along the three coordinate axes.
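
The rules of this section translate directly into operations on triples of components. The following is a minimal sketch; the function names and the sample vectors are hypothetical, and a vector is represented simply as a tuple (x, y, z).

```python
import math

def add(F, G):
    """Sum of two vectors: add corresponding components."""
    return tuple(f + g for f, g in zip(F, G))

def scale(C, F):
    """Product of a scalar C and a vector F."""
    return tuple(C * f for f in F)

def magnitude(F):
    """Length of F by the three-dimensional Pythagorean theorem."""
    return math.sqrt(sum(f * f for f in F))

def direction_cosines(F):
    """Cosines of the angles between F and the x, y, z axes."""
    mag = magnitude(F)
    return tuple(f / mag for f in F)

F = (1.0, 2.0, 2.0)
G = (3.0, 0.0, -1.0)
l, m, n = direction_cosines(F)
print("F + G =", add(F, G))
print("2F =", scale(2.0, F))
print("|F| =", magnitude(F))
print("l^2 + m^2 + n^2 =", l * l + m * m + n * n)
```

The direction cosines returned are just the components of the unit vector along F, which is why the sum of their squares is unity.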

32. Scalar Product of Two Vectors. Multiplication of two
vectors is a rather special process, and there are two entirely
independent products, called the "scalar product" and the
"vector product." We shall first consider the scalar product.
The scalar product of two vectors F and G is denoted by F · G,
and by definition it is a scalar, equal to either (1) the magnitude
of F times the magnitude of G times the cosine of the angle
between; or (2) the magnitude of F times the projection of G on
F; or (3) the magnitude of G times the projection of F on G.
From the last section we see that these definitions are equivalent.
It is often useful to have the scalar product of two vectors in
terms of the components along x, y, and z. We find this by
writing in terms of i, j, and k. Thus we have

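$$\begin{aligned}
\mathbf{F}\cdot\mathbf{G} &= (\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z)\cdot(\mathbf{i}G_x + \mathbf{j}G_y + \mathbf{k}G_z) \\
&= (\mathbf{i}\cdot\mathbf{i})F_xG_x + (\mathbf{i}\cdot\mathbf{j})F_xG_y + (\mathbf{i}\cdot\mathbf{k})F_xG_z \\
&\quad + (\mathbf{j}\cdot\mathbf{i})F_yG_x + (\mathbf{j}\cdot\mathbf{j})F_yG_y + (\mathbf{j}\cdot\mathbf{k})F_yG_z \\
&\quad + (\mathbf{k}\cdot\mathbf{i})F_zG_x + (\mathbf{k}\cdot\mathbf{j})F_zG_y + (\mathbf{k}\cdot\mathbf{k})F_zG_z.
\end{aligned}$$

But now by the fundamental definition,

$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1,$$

$$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{i} = \mathbf{i}\cdot\mathbf{k} = 0.$$

Hence

$$\mathbf{F}\cdot\mathbf{G} = F_xG_x + F_yG_y + F_zG_z.$$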



The scalar product has many uses, principally in cases where 
we are interested in the projections of vectors. For example, 
the scalar product of a vector with a unit vector in a given direc- 
tion equals the projection of the vector in the desired direction. 
The scalar product of a vector with itself equals the square of its 
magnitude, and is often denoted by F 2 . The scalar product of 
two unit vectors gives the cosine of the angle between the 
directions of the two vectors. To prove that two vectors are
at right angles, we need merely prove that their scalar product
is zero.
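
The uses of the scalar product listed above can be tried numerically. The following is a minimal sketch, not part of the original text; the function names and the two sample vectors are assumptions chosen only for illustration.

```python
import math

def dot(f, g):
    """Scalar product F . G = Fx*Gx + Fy*Gy + Fz*Gz."""
    return sum(a * b for a, b in zip(f, g))

def magnitude(f):
    """Magnitude of F, the square root of F . F."""
    return math.sqrt(dot(f, f))

def direction_cosines(f):
    """Direction cosines l, m, n: components of the unit vector along F."""
    size = magnitude(f)
    return tuple(component / size for component in f)

# Illustrative vectors (assumed values, not taken from the text).
F = (1.0, 2.0, 2.0)
G = (2.0, -1.0, 0.0)

l, m, n = direction_cosines(F)
sum_of_squares = l * l + m * m + n * n      # should be 1 up to rounding
cos_angle = dot(F, G) / (magnitude(F) * magnitude(G))
perpendicular = abs(dot(F, G)) < 1e-12       # True when the scalar product vanishes

print(sum_of_squares, cos_angle, perpendicular)
```

Here G was picked so that its scalar product with F is zero, which is the test for right angles described in the text.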

33. Vector Product of Two Vectors. 
The vector product of two vectors F
and G is denoted by (F × G), and
by definition it is a vector, at right
angles to the plane of the two vectors,
equal in magnitude to either (1)
the magnitude of F times the magnitude
of G times the sine of the
angle between them; or (2) the magnitude
of F times the projection of G on the plane normal
to F; or (3) the magnitude of G times the projection of F on the
plane normal to G. We must further specify the sense of the
vector, whether it points up or down from the plane. This is
shown in Fig. 7, where we see that F, G, and F × G have the
same relations as the coordinates x, y, z in a right-handed system
of coordinates. Another way to describe the rule in words is
that, if one rotates F into G, the rotation is such that a right-handed
screw turning in that direction would be driven along
the direction of the vector product. From this rule, we note one
interesting fact: if we interchange the order of the factors, we
reverse the vector. Thus (F × G) = -(G × F).

We can compute the vector product in terms of the compo- 
nents, much as we did with the scalar product. Thus we have 

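$$\begin{aligned}
\mathbf{F}\times\mathbf{G} &= (\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z)\times(\mathbf{i}G_x + \mathbf{j}G_y + \mathbf{k}G_z) \\
&= (\mathbf{i}\times\mathbf{i})F_xG_x + (\mathbf{i}\times\mathbf{j})F_xG_y + (\mathbf{i}\times\mathbf{k})F_xG_z \\
&\quad + (\mathbf{j}\times\mathbf{i})F_yG_x + (\mathbf{j}\times\mathbf{j})F_yG_y + (\mathbf{j}\times\mathbf{k})F_yG_z \\
&\quad + (\mathbf{k}\times\mathbf{i})F_zG_x + (\mathbf{k}\times\mathbf{j})F_zG_y + (\mathbf{k}\times\mathbf{k})F_zG_z.
\end{aligned}$$

[Fig. 7. Direction of the vector product.]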


But now, as we readily see from the definition, 

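$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0,$$

(as, in fact, the vector product of any vector with itself is zero);

$$\mathbf{i}\times\mathbf{j} = -(\mathbf{j}\times\mathbf{i}) = \mathbf{k}, \qquad \mathbf{j}\times\mathbf{k} = -(\mathbf{k}\times\mathbf{j}) = \mathbf{i}, \qquad \mathbf{k}\times\mathbf{i} = -(\mathbf{i}\times\mathbf{k}) = \mathbf{j}.$$

Hence, rearranging terms, we have

$$\mathbf{F}\times\mathbf{G} = \mathbf{i}(F_yG_z - F_zG_y) + \mathbf{j}(F_zG_x - F_xG_z) + \mathbf{k}(F_xG_y - F_yG_x). \tag{2}$$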

As an example of the use of the vector product, we may mention
the angular momentum vector. If we have, as in Chap. V,
a particle of mass m, velocity v (a vector), and we wish its 
angular momentum about a certain center, we must take m times 
the magnitude of the radius vector times the projection of v at 
right angles to the radius. But this is just m times the magni- 
tude of the vector product of r and v. Further, this vector 
product is a vector pointing along the axis of rotation, and in a 
positive direction if the rotation is positive, or counterclockwise, 
so that it is just in the direction conventionally assigned to the 
angular momentum. Hence we have angular momentum 
= m(r × v).

Another example of the use of the vector product comes fre- 
quently, when we may wish to prove two vectors to be parallel. 
To do this, we need only show that their vector product vanishes. 
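
The component formula of Eq. (2) and the two uses just mentioned can be checked with a few lines of code. This is only a sketch; the mass, radius vector, and velocity are assumed values for illustration, not data from the text.

```python
def cross(f, g):
    """Vector product F x G in components, following Eq. (2)."""
    fx, fy, fz = f
    gx, gy, gz = g
    return (fy * gz - fz * gy,
            fz * gx - fx * gz,
            fx * gy - fy * gx)

i, j, k = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

# Illustrative particle (assumed values): mass, radius vector, velocity.
mass = 2.0
r = (1.0, 0.0, 0.0)
v = (0.0, 3.0, 0.0)   # counterclockwise motion about the z axis

angular_momentum = tuple(mass * c for c in cross(r, v))   # m (r x v)
reversed_product = cross(v, r)                             # -(r x v)
parallel_product = cross((1.0, 2.0, 3.0), (2.0, 4.0, 6.0))

print(cross(i, j), angular_momentum, reversed_product, parallel_product)
```

The angular momentum comes out along +z for counterclockwise motion, interchanging the factors reverses the sign, and the product of the two parallel vectors vanishes.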

34. Vector Fields. — Very often in physics one has vectors
which are functions of position. There are two particularly 
common examples, a force field, and a velocity, or flux density, 
in a flowing fluid. In an electric or magnetic or gravitational 
field, for instance, the force on unit charge or pole or mass at 
any point of space is a vector, of components F x , F y , F z , varying 
from point to point in both direction and magnitude. Often 
such a vector field is indicated graphically by introducing lines 
tangent at every point to the vector at that point, called lines of 
force or lines of flow, as the case may be. We shall discuss the
nature of vector fields more in detail in connection with hydro- 
dynamics and the flow of fluids, in Chap. XVII. Our present 
application is to force fields, and our main interest is to discover 
in what cases the force vector can be derived from a potential 


function. To investigate this, let us consider the energy theorem
in three dimensions, deriving the work done in an arbitrary
displacement.

35. The Energy Theorem in Three Dimensions. — Let us start 
with the equations of motion of a particle in a force field, 

$$m\frac{d^2x}{dt^2} = F_x, \qquad m\frac{d^2y}{dt^2} = F_y, \qquad m\frac{d^2z}{dt^2} = F_z. \tag{3}$$

Multiplying by dx/dt, dy/dt, dz/dt, respectively, and integrating 
with respect to time, we have as in the last chapter 

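$$\begin{aligned}
\tfrac{1}{2}mv_x^2 - \tfrac{1}{2}mv_{x0}^2 &= \int F_x\,dx, \\
\tfrac{1}{2}mv_y^2 - \tfrac{1}{2}mv_{y0}^2 &= \int F_y\,dy, \\
\tfrac{1}{2}mv_z^2 - \tfrac{1}{2}mv_{z0}^2 &= \int F_z\,dz.
\end{aligned}$$

Adding these, we have

$$\tfrac{1}{2}m(v_x^2 + v_y^2 + v_z^2) - \tfrac{1}{2}m(v_{x0}^2 + v_{y0}^2 + v_{z0}^2) = \int (F_x\,dx + F_y\,dy + F_z\,dz). \tag{4}$$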

Now $v_x^2 + v_y^2 + v_z^2$ is the square of the magnitude of the velocity,
or is $v^2$. Thus the left side of Eq. (4) is the final kinetic energy of
the particle minus the initial kinetic energy, so that the integral
on the right should be the work done. The integrand is evidently
a scalar product: the product of the vector F and the infinitesimal
displacement vector of components dx, dy, dz, which we may
call ds. The scalar product, which is F · ds, is the displacement
times the projection of the force in the direction of motion. This
is what is ordinarily called the work done, since only the compo- 
nent of force along the motion does work. The integral is simply 
the sum of all the infinitesimal amounts of work done, or is the 
total work done, as in one-dimensional motion. 

36. Line Integrals and Potential Energy. — The integral 
∫F · ds is called a line integral, for its evaluation demands the
knowledge of a definite path between starting point and end 
point, as well as of the function F. In general this integral will 
depend on the path as well as the end points. For instance, 
suppose the lines of force went in circles, as in Fig. 8. Then the 
work done along the path ABC is positive, since the force and 



displacement are parallel; along ADC the work is negative, since
force and displacement are opposite; while along AEC it is zero,
force and displacement being at right angles. Intermediate
paths would yield any value we chose for the work done. Hence
we surely could not define a potential, for the work done between
A and C could not be set equal to the difference of potential in
any unique way. In discussing one-dimensional motion, we
saw that a potential depending only on position could not be
introduced if the force depended on velocity, time, or anything
except displacement. Here the condition is more stringent: we
cannot have a potential, even if force depends only on position,
unless the integral ∫F · ds is independent of path. If this
condition is satisfied, however, we can set up a potential energy V,
such that -∫F · ds from some standard point where the potential
is zero, up to the point we are interested in, equals V. Evidently
another way of stating the criterion for existence of a potential
is that the work done in taking a particle about
any arbitrary closed path, or ∮F · ds where the integral is about
a closed curve and back to the starting point, be zero. Still a
third condition, easier to apply in actual cases, will be derived in
a later section.

[Fig. 8. A nonconservative force field. The work done along ABC is positive; the work done between A and C is not independent of path.]
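
The dependence of a line integral on the path can be seen in a small numerical experiment. The sketch below is illustrative only: it assumes a circulating field with components (-y, x), takes A = (1, 0) and C = (-1, 0), and approximates ∫F · ds by a midpoint sum along three assumed paths (two semicircles and the straight diameter).

```python
import math

def line_integral(force, path, steps=20000):
    """Approximate the integral of F . ds along path(t), 0 <= t <= 1,
    evaluating the force at the midpoint of each small chord."""
    total = 0.0
    x0, y0 = path(0.0)
    for n in range(1, steps + 1):
        x1, y1 = path(n / steps)
        fx, fy = force(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += fx * (x1 - x0) + fy * (y1 - y0)
        x0, y0 = x1, y1
    return total

def circulating(x, y):
    """Assumed example field whose lines of force are circles about the origin."""
    return (-y, x)

def upper_arc(t):      # from A to C, moving with the field
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def lower_arc(t):      # from A to C, moving against the field
    return (math.cos(math.pi * t), -math.sin(math.pi * t))

def diameter(t):       # from A to C, straight across, at right angles to the field
    return (1.0 - 2.0 * t, 0.0)

work_with = line_integral(circulating, upper_arc)
work_against = line_integral(circulating, lower_arc)
work_across = line_integral(circulating, diameter)
print(work_with, work_against, work_across)
```

The three paths share the same end points but give a positive, a negative, and a zero result, so no single-valued potential difference between A and C could be defined for such a field.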

37. Force as Gradient of Potential. — Let us suppose that it is 
possible to set up a potential function V in a given case. We 
know how to write V as the negative line integral of F. Now we 
ask the opposite question: Given V, how do we find F? Let us 
suppose that we are at a given point of space, and that we allow 
the coordinates to increase by small amounts dx, dy, dz, forming 
a vector ds, while at the same time we exert a force -F to balance
the force of the field. Then first, we shall do the work -F · ds
on the system; second, the potential will increase by the amount
dV = V(x + dx, y + dy, z + dz) - V(x, y, z). These must be
equal; and writing the scalar product as $F_s|ds|$, where |ds| is the
magnitude of the displacement, $F_s$ the component of F parallel
to the displacement, we have

$$dV = -\mathbf{F}\cdot d\mathbf{s} = -F_s\,|d\mathbf{s}|, \qquad F_s = -\frac{dV}{ds}. \tag{5}$$

A derivative of the sort occurring in Eq. (5), where we take the 
difference of a scalar function like V at two neighboring points, 
divide by the magnitude of the displacement, and pass to the 
limit, is called a directional derivative, for evidently its value 
depends on the direction in which the displacement is made. We 
thus have the result that the component of force in any direction 
is the negative directional derivative of the potential in the 
desired direction. 

The x component of force is determined from the directional 
derivative of V along the x direction. To find that, we allow x 
to increase by dx, keeping y and z fixed; divide the difference 
V(x + dx, y, z) — V(x, y, z) by dx; and pass to the limit as dx 
becomes small. But this is simply the partial derivative of V 
with respect to x. We see, in other words, that a partial deriva- 
tive of a function is merely a special case of a directional deriva- 
tive, in which the direction is along one of the coordinate axes. 
Using this fact, we then have 

$$F_x = -\frac{\partial V}{\partial x}, \qquad F_y = -\frac{\partial V}{\partial y}, \qquad F_z = -\frac{\partial V}{\partial z}. \tag{6}$$

The three partial derivatives in Eq. (6) are evidently the com- 
ponents of a vector, called the gradient of V, and abbreviated 
grad V. Thus 

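$$\operatorname{grad} V = \mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z}, \tag{7}$$

and we may write a vector equation

$$\mathbf{F} = -\operatorname{grad} V. \tag{8}$$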
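
As a rough numerical illustration of Eqs. (5) to (8), one can estimate the partial and directional derivatives of a potential by finite differences. The sketch below makes several assumptions of its own: the potential V = x² + y² + z² (an elastic restoring force), the step size h, and the sample point and direction are all illustrative choices.

```python
import math

def gradient(v, point, h=1e-5):
    """Central-difference estimate of grad V at point = (x, y, z)."""
    x, y, z = point
    return ((v(x + h, y, z) - v(x - h, y, z)) / (2.0 * h),
            (v(x, y + h, z) - v(x, y - h, z)) / (2.0 * h),
            (v(x, y, z + h) - v(x, y, z - h)) / (2.0 * h))

def force(v, point):
    """Force as the negative gradient of the potential."""
    return tuple(-c for c in gradient(v, point))

def directional_derivative(v, point, direction, h=1e-5):
    """dV/ds along a unit vector: difference of V at two neighboring
    points divided by the distance between them."""
    x, y, z = point
    ux, uy, uz = direction
    ahead = v(x + h * ux, y + h * uy, z + h * uz)
    behind = v(x - h * ux, y - h * uy, z - h * uz)
    return (ahead - behind) / (2.0 * h)

def spring_potential(x, y, z):
    """Assumed example potential V = x^2 + y^2 + z^2."""
    return x * x + y * y + z * z

point = (1.0, 2.0, 3.0)
unit = tuple(c / math.sqrt(3.0) for c in (1.0, 1.0, 1.0))

F = force(spring_potential, point)
F_s = -directional_derivative(spring_potential, point, unit)
F_dot_unit = sum(a * b for a, b in zip(F, unit))
print(F, F_s, F_dot_unit)
```

The component of force along the chosen direction, found from the directional derivative, agrees with the scalar product of the force with the unit vector, as Eq. (5) requires.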

38. Equipotential Surfaces. — Let us take a displacement ds in 
a direction tangent to an equipotential surface, or surface on 
which V is constant. Then no work is done, so that dV = 0. 
But also F · ds = 0. If this is so, then F and ds must be at right
angles. Thus we have proved that the force, and hence the lines 
of force, are at right angles to the equipotential surfaces. Any 
scalar function of position can be described by a set of surfaces, 
like equipotentials, on which it is constant. We see then that 
the gradient of such a function is a vector, at right angles to the 


equipotentials, measuring the rate of change of the function in 
this direction. The name gradient comes from contour maps in 
two dimensions. There the contours are lines of constant alti- 
tude, and the ordinary gradient of a slope is the rate of change 
of height with horizontal distance, in the direction at right angles 
to the contours, or the direction of steepest slope. In our case, 
the gradient points in the direction in which the function 
increases, while the force, being the negative gradient of the 
potential, points in the direction in which the potential decreases. 

39. The Curl and the Condition for a Conservative System. — 
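Let $F_x = -\partial V/\partial x$, $F_y = -\partial V/\partial y$. Differentiating the first
with respect to y, the second with respect to x, we have
$\partial F_x/\partial y = -\partial^2 V/\partial y\,\partial x$, $\partial F_y/\partial x = -\partial^2 V/\partial x\,\partial y$. But by the fundamental
theorem of partial differentiation, these two are equal, so that
$\partial F_x/\partial y = \partial F_y/\partial x$. Similarly we have two other equations.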
These can be combined in a single vector equation. We shall 
find that it is useful to set up a vector called the curl, according 
to the definition 

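$$\operatorname{curl}\mathbf{F} = \mathbf{i}\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right). \tag{9}$$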

Then our three equations are combined in the one vector equa- 
tion curl F = 0. These form relations between the components 
of force, which plainly must be fulfilled if there is a potential. 
Yet it is by no means true that any set of forces will satisfy 
these conditions. The vanishing of the curl at all points of 
space, then, is a necessary condition which F must satisfy, if 
it is derivable from a potential. It can be proved that it is 
also a sufficient condition, so that it is the criterion which we 
desired, telling whether a potential can be set up in a given 
problem or not. As we shall see in a problem, the nonvanishing 
of the curl of a vector in general means whirlpool-like lines of 
force, as in Fig. 8. 
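
The criterion can be tried on simple examples by estimating the curl with finite differences. The sketch below is illustrative and rests on assumptions: the step size h, the sample point, a whirlpool-like field (-y, x, 0), and a second field obtained from the assumed potential V = x²y + z³.

```python
def curl(field, point, h=1e-5):
    """Central-difference estimate of curl F at point = (x, y, z),
    with components arranged as in Eq. (9)."""
    x, y, z = point

    def partial(component, axis):
        # Derivative of one component of F with respect to one coordinate.
        ahead = [x, y, z]
        behind = [x, y, z]
        ahead[axis] += h
        behind[axis] -= h
        return (field(*ahead)[component] - field(*behind)[component]) / (2.0 * h)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def whirlpool(x, y, z):
    """Assumed field with circular lines of force about the z axis."""
    return (-y, x, 0.0)

def from_potential(x, y, z):
    """F = -grad V for the assumed potential V = x^2 y + z^3."""
    return (-2.0 * x * y, -x * x, -3.0 * z * z)

point = (0.7, -1.2, 0.4)
curl_whirlpool = curl(whirlpool, point)
curl_conservative = curl(from_potential, point)
print(curl_whirlpool, curl_conservative)
```

The whirlpool field has a curl of magnitude 2 along z at every point, while the field derived from a potential has vanishing curl to within rounding error.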

40. The Symbolic Vector ∇. — We have seen two vector differential
operators, the gradient and the curl. These can both
be expressed conveniently in terms of a symbolic vector operator
∇, equal to (i ∂/∂x + j ∂/∂y + k ∂/∂z). Of course, this operator
by itself has no meaning, but its interpretation is that it is always
to be followed by some other quantity, and the differentiations
are to be performed on this quantity. Thus if we have a scalar
V, the quantity ∇V is a vector, equal to

$$\nabla V = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)V = \mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z} = \operatorname{grad} V. \tag{10}$$

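Similarly, if we have a vector F, the vector product (∇ × F) is
equal to

$$(\nabla\times\mathbf{F}) = \mathbf{i}\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) = \operatorname{curl}\mathbf{F}. \tag{11}$$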

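In the course of time, we shall meet several other vector
operations, which can be expressed in terms of ∇. We shall
merely define them now, though we shall have many applications
later. If we have a vector F, the scalar product of ∇ with F, or
(∇ · F), is a scalar, evidently equal to

$$\nabla\cdot\mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}. \tag{12}$$

This is called the divergence of F, abbreviated div F. Again, if
we have a scalar V, and take two factors ∇ multiplied by V,
or (∇ · ∇)V, the result is

$$\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)V = \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)V = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2}. \tag{13}$$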

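This is called the Laplacian of V, and there is no usual abbreviation,
except ∇²V, which evidently is equivalent to the method
of writing above. Clearly ∇²V = div grad V. Finally we can
take the Laplacian of a vector: if F is a vector,

$$\nabla^2\mathbf{F} = \mathbf{i}\left(\frac{\partial^2 F_x}{\partial x^2} + \frac{\partial^2 F_x}{\partial y^2} + \frac{\partial^2 F_x}{\partial z^2}\right) + \mathbf{j}\left(\frac{\partial^2 F_y}{\partial x^2} + \frac{\partial^2 F_y}{\partial y^2} + \frac{\partial^2 F_y}{\partial z^2}\right) + \mathbf{k}\left(\frac{\partial^2 F_z}{\partial x^2} + \frac{\partial^2 F_z}{\partial y^2} + \frac{\partial^2 F_z}{\partial z^2}\right).$$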


1. Find the angle between the diagonal of a cube and one of the edges. 
(Hint: regard the diagonal as a vector i + j + k.)

2. Given a vector i + 2j + 3k, and a second i - 2j + ak, find a so that
the two vectors are at right angles to each other. 


3. Let $F_x = y$, $F_y = -x$, $F_z = 0$. Prove that this vector field represents
a force tangent to circles about the origin in the xy plane. Compute ∮F · ds
around such a circle.

4. Find the curl of the force in the preceding problem. Discuss the 
question as to whether it is a conservative field or not. 

5. In the gravitational field of a mass m, the potential is given by -m/r,
where r is the distance from the mass, given by $r^2 = x^2 + y^2 + z^2$, if the
mass is at the origin. Obtain the components of the force vector by direct 
differentiation. Find the curl of the force, and show that it is zero. 

6. Find which ones of the following forces are derivable from potentials, 
and describe the physical nature of the force fields. Set up the potential 
in cases where that can be done : 

(a) $F_x = \dfrac{y}{x^2 + y^2}$, $F_y = \dfrac{-x}{x^2 + y^2}$, $F_z = 0$.

(b) $F_x = \dfrac{y}{\sqrt{x^2 + y^2}}$, $F_y = \dfrac{-x}{\sqrt{x^2 + y^2}}$, $F_z = 0$.

(c) $F_x = xf(r)$, $F_y = yf(r)$, $F_z = zf(r)$, where f(r) is an arbitrary function
of the distance from the origin.

(d) $F_x = f_1(x)$, $F_y = f_2(y)$, $F_z = f_3(z)$.

7. Prove that lx + my + nz = k, where l, m, n, k are constants, and
$l^2 + m^2 + n^2 = 1$, is the equation of a plane whose normal has the direction
cosines l, m, n, and whose shortest distance from the origin is k.

8. Taking the potential field from Prob. (5), find the line integral ∮F · ds
around a square of arbitrary size in the xy plane, with the origin at its 
center. Show by direct calculation that the integral always vanishes. Do 
the same for a path made up as follows: the part of the square of side 2a, 
made of lines at x = -a, y = ±a, which lies at negative values of x, and
the part of the circle of radius a, center at the origin, which joins onto and 
completes the figure for positive x's. 

9. Prove that A · (B × C) = B · (C × A) = C · (A × B), where A, B, C
are any vectors. Show that these are equal to the determinant

$$\begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}.$$

10. Prove that A × (B × C) = B(A · C) - C(A · B), where A, B, C
are any vectors. 

11. Prove that div aF = a div F + (F · grad a), where a is a scalar, F a
vector.

12. Prove that curl aF = a curl F + [(grad a) × F], where a is a scalar,
F a vector.

13. Prove that div (F × G) = (G · curl F) - (F · curl G), where F, G are
vectors.

14. Prove that div curl F = 0, where F is any vector. 

15. Prove that curl curl F = grad div F - ∇²F, where F is any vector.



In considering mechanical problems with several variables, it 
is seldom very convenient to use ordinary rectangular coordinates. 
In working with problems in the motion of particles, we often 
wish to introduce curvilinear coordinates, as for instance polar 
coordinates. With rigid dynamics, we often use rather com- 
plicated quantities to give the orientation of a rigid body in 
space. For instance, with a top or gyroscope, we may use 
Euler's angles, namely, the latitude and longitude angles of the 
axis of the top with reference to a fixed north pole, and the angle 
of rotation of the top about its own axis. All these coordinates 
come under the general description of generalized coordinates. 
Any quantities which are capable of describing the positions 
of the parts of a system, whether they be distances, angles, or 
any other quantities, can serve as generalized coordinates. 
Now when we begin to examine the equations of motion in 
generalized coordinates, we naturally find that they can be 
very complicated. In a later section we shall start with the 
ordinary equations of motion in rectangular coordinates, intro- 
duce new coordinates as functions of the old, and find the new 
equations of motion by direct change of variables. We find 
many new terms coming in, as soon as the change of variables 
is at all complicated. But we shall find that there are several 
fairly simple ways of writing the equations of motion, different 
in form from Newton's equations, though essentially identical, 
which preserve their simple form even in generalized coordinates. 
The most elementary of these methods is that of Lagrange's 
equations, and we consider them in this chapter. 

41. Lagrange's Equations. — We start our discussion of 
Lagrange's equations merely by restating Newton's second law 
of motion, in a slightly different way. For the moment, we 
consider only problems where there is a potential energy function. 
Since $F_x = -\partial V/\partial x$, etc., the equations of motion, written
in terms of momenta, are

$$\frac{d(mv_x)}{dt} = -\frac{\partial V}{\partial x}, \tag{1}$$

etc. There is an interesting way in which these equations can 
be written. Let the kinetic energy be called T, so that 

$$T = \frac{m}{2}(v_x^2 + v_y^2 + v_z^2), \tag{2}$$

if it is written in terms of the velocity components. If we keep
this form, we observe that $mv_x = \partial T/\partial v_x$, which we note is the
x component of momentum. Hence we can write our equations

$$\frac{d}{dt}\left(\frac{\partial T}{\partial v_x}\right) + \frac{\partial V}{\partial x} = 0, \tag{3}$$

etc. But this can be put in another form, if we let T — V = L, 
called the Lagrangian function (and different from the total 
energy, which is T + V). T is to be considered a function 
of the velocity components, and V of the coordinates, so that L 
is a function of all these six variables. Since T depends only 
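on the v's, and V only on the x's, we have $\partial T/\partial v_x = \partial L/\partial v_x$,
$\partial V/\partial x = -\partial L/\partial x$, etc. Hence the equations of motion are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial v_x}\right) - \frac{\partial L}{\partial x} = 0, \tag{4}$$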

with similar equations for y and z. In this form, the equations 
are called Lagrange's equations of motion, and they are simply 
convenient ways of writing Newton's second law of motion. 

As we have stated, the importance of Lagrange's equations is 
that they hold in any sort of coordinates, not merely in rectangular
coordinates. Thus, if the coordinates are $q_1 \ldots q_n$, and
their time derivatives are $\dot q_1 \ldots \dot q_n$, the equations are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0. \tag{5}$$

Here as before L = T - V, but now it is no longer true, as 
before, that T depends only on the velocities, V only on the 
coordinates. Instead, T generally involves the coordinates 
as well, so that the term $\partial L/\partial q_i$ has some contributions coming
from $\partial T/\partial q_i$, which are evidently absent in rectangular coordinates.
We shall see by an example that these terms are a sort
of fictitious force introduced by using the generalized coordinates, 


and of which the centrifugal force in polar coordinates is a typical 
case. We postpone a proof of Lagrange's equations to a later 
section, giving first an example of their usefulness by discussing 
the motion of a particle in a central field, as a planet about the
sun.
42. Planetary Motion. — As an example of two-dimensional 
motion, and of the Lagrangian equations, we consider the case 
where V = V(r), a function only of the distance r from a given 
point. This problem is almost impossible to discuss completely 
if we use rectangular coordinates, but if we take polar coordinates, 
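r, θ, we find that we can separate variables, and that the problem
is then easily solved. To apply Lagrange's method to this case,
we write L as a function of r, θ, $\dot r$, and $\dot\theta$. Then we have

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot r}\right) - \frac{\partial L}{\partial r} = 0, \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot\theta}\right) - \frac{\partial L}{\partial \theta} = 0.$$

First we find L. The velocity is made up of two vector components
at right angles, along the radius and along the tangent to
a circle. The first is $\dot r$, the second $r\dot\theta$, so that $v^2 = \dot r^2 + r^2\dot\theta^2$, and

$$L = T - V = \frac{m}{2}(\dot r^2 + r^2\dot\theta^2) - V(r).$$

Hence

$$\frac{\partial L}{\partial \dot r} = m\dot r, \qquad \frac{\partial L}{\partial \dot\theta} = mr^2\dot\theta, \qquad \frac{\partial L}{\partial r} = mr\dot\theta^2 - \frac{dV}{dr}, \qquad \frac{\partial L}{\partial \theta} = 0.$$

Then the equations are

$$\frac{d}{dt}(m\dot r) - mr\dot\theta^2 + \frac{dV}{dr} = 0, \qquad \frac{d}{dt}(mr^2\dot\theta) = 0. \tag{8}$$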

The second may be immediately integrated: $mr^2\dot\theta$ = constant.
This has a simple interpretation, for $mr^2\dot\theta$ is simply the angular
momentum, since $mr^2$ is the moment of inertia, $\dot\theta$ the angular
velocity, and our equation states that it is constant, since no
torque is acting. As a matter of fact, $\partial L/\partial \dot q_i$ is called the
generalized momentum associated with the generalized coordinate
$q_i$, and linear and angular momenta are special cases of the
generalized momentum. Let then $mr^2\dot\theta = p$, where p is a
constant (momenta are conventionally called p, as coordinates
are called q). Next we may consider the first equation,
$m\,d^2r/dt^2 = mr\dot\theta^2 - dV/dr$. The first term on the right-hand
side is at first unexpected. But when we look at it, we see that
it is the centrifugal force, which must be added to the external
force to produce the radial acceleration.
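
The constancy of the angular momentum can be checked by following a particle numerically in rectangular coordinates, where $mr^2\dot\theta$ takes the form $m(xv_y - yv_x)$. The sketch below is only an illustration: it assumes an inverse-square attraction with unit strength, unit mass, an arbitrary starting position and velocity, and a simple step-by-step integration scheme chosen for convenience.

```python
def acceleration(x, y, strength=1.0):
    """Inverse-square attraction toward the origin, per unit mass (assumed units)."""
    r_cubed = (x * x + y * y) ** 1.5
    return (-strength * x / r_cubed, -strength * y / r_cubed)

def angular_momentum(m, x, y, vx, vy):
    """m r^2 (d theta/dt), written in rectangular form as m (x vy - y vx)."""
    return m * (x * vy - y * vx)

m = 1.0
x, y, vx, vy = 1.0, 0.0, 0.0, 0.8     # assumed initial conditions
dt = 1e-3

p_start = angular_momentum(m, x, y, vx, vy)
ax, ay = acceleration(x, y)
for _ in range(5000):
    # Half step in velocity, full step in position, half step in velocity.
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
    x += dt * vx
    y += dt * vy
    ax, ay = acceleration(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
p_end = angular_momentum(m, x, y, vx, vy)

print(p_start, p_end)
```

Since the force always points along the radius, each step leaves $xv_y - yv_x$ unchanged apart from rounding, which is the numerical counterpart of the statement that no torque is acting.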

We can now solve our equations. Setting $mr^2\dot\theta = p$, we have
$\dot\theta = p/mr^2$, so that $m\,d^2r/dt^2 = p^2/mr^3 - dV/dr = -\frac{d}{dr}(V + p^2/2mr^2)$.
We have separated the variable r from θ, and the
result is just like the equation for a one-dimensional problem with
a potential $V + p^2/2mr^2$, the latter being a sort of fictitious potential
energy coming from the centrifugal force. For example, if
the force is a gravitational one, V = -Gmm'/r, where m' is the
mass of the attracting body, G the gravitational constant, so
that we have the problem of the apparent potential $-Gmm'/r + p^2/2mr^2$.
Except for the constants, this is the case of the potential
$-(1/x) + (1/x^2)$, which we have already taken up in Probs. 3
and 4, Chap. V. We showed there that motions of negative 
energy are oscillatory in r, so that the orbit is concentrated in a 
finite region, and motions of positive energy go to infinity. We 
leave the exact discussion to a problem, but it proves to be true 
that the finite orbits are periodic and are ellipses with the attract- 
ing center at one focus, while the open orbits are hyperbolas. 
This is, however, a special case, and we proceed to a qualitative 
discussion of the general central motion, by the method of energy. 

43. Energy Method for Radial Motion in Central Field. — We 
have seen that the radial motion of a particle in a central field 
is just like the one-dimensional motion of a particle in a potential
$V + (p^2/2mr^2)$, where p is the constant angular momentum.
This problem can be discussed as in Chap. V, plotting the curve
$V + (p^2/2mr^2)$ as a function of r, and drawing the horizontal
line at height E, as in Fig. 6. Aside from this, we can make no
general statement. But in many important physical cases, the
curve resembles A or B in Fig. 9, the rise at r = 0 arising from
the centrifugal force, and the potential V representing attraction
in A, repulsion in B. With energy $E_1$ in either case, the motion



would come in from infinity to a smallest distance (c or d), called 
the perihelion, from the astronomical analogy, perihelion meaning 
near the sun. It would then reverse, and travel outward for 
infinite time. The energy $E_2$, however, would represent no
possible motion with the curve B, but with the attractive potential
A, which resembles the gravitational attraction mentioned
in the preceding section, there would be oscillatory motion
between the perihelion a and the aphelion b. This motion
would be periodic, and the radius as function of time, and likewise
the period, could be computed by the method of the energy
integral discussed in Chap. V.

[Fig. 9. Curves of $V + p^2/2mr^2$ as functions of r. Case A, attraction; B, repulsion. With energy $E_1$, motion goes to infinity with either potential; with $E_2$, motion impossible with curve B, oscillatory between limits a and b with curve A.]
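
The turning points of the radial motion are the values of r at which $V + p^2/2mr^2$ equals E, and they can be located numerically. The sketch below is illustrative only: it assumes the gravitational potential in units where Gmm' = 1, assumed values of m, p, and E, and a plain bisection search on either side of the minimum of the curve.

```python
def effective_potential(r, p, m, potential):
    """V(r) + p^2 / (2 m r^2), the curve plotted against r."""
    return potential(r) + p * p / (2.0 * m * r * r)

def bisect(func, lo, hi, tol=1e-12):
    """Root of func between lo and hi, assuming a sign change in between."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def gravity(r):
    """Assumed attractive potential V = -1/r (units with G m m' = 1)."""
    return -1.0 / r

m, p, energy = 1.0, 0.8, -0.68        # assumed illustrative values

def excess(r):
    """Effective potential minus the energy; zero at a turning point."""
    return effective_potential(r, p, m, gravity) - energy

r_minimum = p * p / m                  # minimum of the curve for V = -1/r
perihelion = bisect(excess, 0.05, r_minimum)
aphelion = bisect(excess, r_minimum, 5.0)
print(perihelion, aphelion)
```

For these values the motion is confined between the two roots, as for the energy $E_2$ with curve A; a positive energy would leave only the inner root.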

44. Orbits in Central Motion. — The best picture of central 
motion is obtained by considering the orbit in space, as in Fig. 10. 
Suppose we consider a motion oscillatory in r, as the case E 2 
of Fig. 9. Then we may draw two circles, of radii equal to 
the perihelion and aphelion distances, respectively, and the 
motion will take place between the circles. The orbit must be 


tangential to both circles, as shown. If the motion starts 
on the outer circle, the particle will move with continually 
decreasing radius until it touches the inner circle. At the same 
time, however, on account of the angular momentum, it will 
be turning around, and the angle made by the radius vector 
will have turned through a definite amount between the points 
of contact with outer and inner circles. After touching the inner 
circle, the whole procedure is reversed, r increasing to the maximum
value, so that after a certain time the point will touch the
outer circle again.

[Fig. 10. Orbit of a particle in central motion.]

Now between the two successive points where the orbit touches 
the outer circle, there will be a certain length of arc. It may be 
that this is a rational fraction, say m/n, of the circumference, 
where m and n are integers. In that case, after n excursions 
to the center and out again, the aphelion point will have gone 
around the circle m times, and will have come back to the starting 
point. Thus the motion is periodic, repeating itself after a 
certain length of time. For example, if the particle is attracted 
to the center according to the inverse square, m/n is just 1, 
and the particle always comes back to the same point on the 
circle. But if the length of arc is an irrational fraction of the cir- 
cumference, as in Fig. 10, the motion is not periodic, and will 
never repeat itself. Nevertheless, it is what is called doubly 
periodic. The motion resembles a slowly rotating ellipse, 


rotating so that successive aphelion points, instead of lying on 
top of each other, are displaced with respect to each other by a 
given angle. This slow rotation is called precession, and one 
can find the frequency, and angular velocity, of the precessional 
motion. If now we imagined a turntable to rotate with the 
precessional frequency, and traced out the motion on this turn- 
table, the path would be closed, somewhat like an ellipse. In 
other words, the whole motion is a combination of a periodic 
motion, superposed on a rotation. These two motions have in 
general entirely independent frequencies, and that is the origin 
of the statement that the motion is doubly periodic. 
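
The angle turned through between successive aphelion points can be estimated by integrating the radial equation together with $\dot\theta = p/mr^2$. The following sketch is an illustration under stated assumptions: the potential $V = -k/r - h/r^2$ (an inverse-square attraction plus a small added term), the values of m, p, k, h, and the step size are hypothetical choices, and the integration scheme is a simple one picked for brevity. For this particular potential the orbit equation in u = 1/r is that of a simple harmonic motion, which gives the angle $2\pi/\sqrt{1 - 2mh/p^2}$ for comparison.

```python
import math

def radial_acceleration(r, p, m, k, h):
    """(1/m) [p^2/(m r^3) - dV/dr] for the assumed V = -k/r - h/r^2."""
    return (p * p / (m * r ** 3) - k / r ** 2 - 2.0 * h / r ** 3) / m

def angle_between_aphelia(r0, p, m=1.0, k=1.0, h=0.0, dt=1e-3, max_steps=200000):
    """Start at an aphelion (r = r0, dr/dt = 0) and return the angle turned
    through when the radius next passes through a maximum."""
    r, r_dot, theta = r0, 0.0, 0.0
    acc = radial_acceleration(r, p, m, k, h)
    for _ in range(max_steps):
        r_dot_half = r_dot + 0.5 * dt * acc
        r_new = r + dt * r_dot_half
        acc = radial_acceleration(r_new, p, m, k, h)
        r_dot_new = r_dot_half + 0.5 * dt * acc
        # d(theta)/dt = p / (m r^2), averaged over the step.
        theta_new = theta + 0.5 * dt * (p / (m * r * r) + p / (m * r_new * r_new))
        if r_dot > 0.0 and r_dot_new <= 0.0:
            # dr/dt changed sign from outward to inward: interpolate the angle.
            fraction = r_dot / (r_dot - r_dot_new)
            return theta + fraction * (theta_new - theta)
        r, r_dot, theta = r_new, r_dot_new, theta_new
    raise RuntimeError("no second aphelion found")

kepler_angle = angle_between_aphelia(1.0, 0.8)
perturbed_angle = angle_between_aphelia(1.0, 0.8, h=0.02)
precession_per_turn = perturbed_angle - 2.0 * math.pi
print(kepler_angle, perturbed_angle, precession_per_turn)
```

With h = 0 the aphelion returns after one full turn, so the orbit closes; with the added term the aphelion advances by a fixed angle on each excursion, which is the precession described above.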

45. Justification of Lagrange's Method. — We shall now show 
in our special case of polar coordinates how Lagrange's method 
could be justified, using this as a model for the general treatment.
Surely the equations of motion are

$$m\frac{d^2x}{dt^2} = -\frac{\partial V}{\partial x}, \qquad m\frac{d^2y}{dt^2} = -\frac{\partial V}{\partial y}.$$

We introduce the polar coordinates, x = r cos θ, y = r sin θ.
Then dx/dt = cos θ dr/dt - r sin θ dθ/dt,

$$\frac{d^2x}{dt^2} = \frac{d^2r}{dt^2}\cos\theta - 2\sin\theta\,\frac{dr}{dt}\frac{d\theta}{dt} - r\sin\theta\,\frac{d^2\theta}{dt^2} - r\cos\theta\left(\frac{d\theta}{dt}\right)^2, \tag{9}$$

$$\frac{d^2y}{dt^2} = \frac{d^2r}{dt^2}\sin\theta + 2\cos\theta\,\frac{dr}{dt}\frac{d\theta}{dt} + r\cos\theta\,\frac{d^2\theta}{dt^2} - r\sin\theta\left(\frac{d\theta}{dt}\right)^2. \tag{10}$$
Using these, we can obtain the equations of motion in x and y.
But now multiply Eq. (9) by cos θ, Eq. (10) by sin θ, and add.
The result on the left is $m\,d^2r/dt^2 - mr(d\theta/dt)^2$, and on the right
$-(\partial V/\partial x\,\cos\theta + \partial V/\partial y\,\sin\theta)$, which is just $-\partial V/\partial r$, since
the latter should be $-(\partial V/\partial x\;\partial x/\partial r + \partial V/\partial y\;\partial y/\partial r)$, and
$\partial x/\partial r = \cos\theta$, $\partial y/\partial r = \sin\theta$. Thus we have the first of
Lagrange's equations. Next, multiply Eq. (9) by -r sin θ,
Eq. (10) by r cos θ, and add. On the left, we have $2mr\,(dr/dt)(d\theta/dt) + mr^2\,d^2\theta/dt^2$,
which equals $m\,d/dt(r^2\,d\theta/dt)$, and on the right we
have $r\,\partial V/\partial x\,\sin\theta - r\,\partial V/\partial y\,\cos\theta = -\partial V/\partial\theta$. Thus the
second equation becomes $m\,d/dt(r^2\,d\theta/dt) = -\partial V/\partial\theta$, the second
of Lagrange's equations (whose right member is zero in the case
of a central field).
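
The change of variables in Eqs. (9) and (10) can be spot-checked numerically. In the sketch below, the functions r(t) and θ(t) are arbitrary assumed polynomials, not a solution of any equation of motion; we compare second differences of x = r cos θ and y = r sin θ with the right-hand sides of Eqs. (9) and (10), and then form the two combinations used in the text.

```python
import math

# Assumed test functions r(t), theta(t) and their exact derivatives.
def r(t): return 1.0 + 0.3 * t * t
def r_dot(t): return 0.6 * t
def r_ddot(t): return 0.6
def theta(t): return 0.5 * t + 0.2 * t * t
def theta_dot(t): return 0.5 + 0.4 * t
def theta_ddot(t): return 0.4

def x(t): return r(t) * math.cos(theta(t))
def y(t): return r(t) * math.sin(theta(t))

def second_difference(func, t, h=1e-4):
    """Finite-difference estimate of the second time derivative."""
    return (func(t + h) - 2.0 * func(t) + func(t - h)) / (h * h)

def x_ddot_formula(t):
    """Right-hand side of Eq. (9)."""
    c, s = math.cos(theta(t)), math.sin(theta(t))
    return (r_ddot(t) * c - 2.0 * s * r_dot(t) * theta_dot(t)
            - r(t) * s * theta_ddot(t) - r(t) * c * theta_dot(t) ** 2)

def y_ddot_formula(t):
    """Right-hand side of Eq. (10)."""
    c, s = math.cos(theta(t)), math.sin(theta(t))
    return (r_ddot(t) * s + 2.0 * c * r_dot(t) * theta_dot(t)
            + r(t) * c * theta_ddot(t) - r(t) * s * theta_dot(t) ** 2)

t = 0.7
c, s = math.cos(theta(t)), math.sin(theta(t))

# Eq. (9) times cos(theta) plus Eq. (10) times sin(theta).
radial = c * x_ddot_formula(t) + s * y_ddot_formula(t)
radial_expected = r_ddot(t) - r(t) * theta_dot(t) ** 2

# Eq. (9) times -r sin(theta) plus Eq. (10) times r cos(theta).
angular = r(t) * (-s * x_ddot_formula(t) + c * y_ddot_formula(t))
angular_expected = 2.0 * r(t) * r_dot(t) * theta_dot(t) + r(t) ** 2 * theta_ddot(t)

print(second_difference(x, t), x_ddot_formula(t))
print(radial, radial_expected, angular, angular_expected)
```

The two combinations reproduce, apart from the factor m, the left-hand sides of the two Lagrange equations for r and θ.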

Just such a change of variables can be carried out in the general 
case. Suppose that, for the sake of simplicity, we still take only 


two dimensions; the general proof goes through in just the same 
way, except with more complicated expressions. We start with 
two rectangular coordinates x and y, in terms of which we have 
the ordinary Newtonian equations $m\,d^2x/dt^2 = -\partial V/\partial x$,
$m\,d^2y/dt^2 = -\partial V/\partial y$, and two generalized coordinates $q_1$ and
$q_2$, given as functions of x and y, so that $q_1 = q_1(x, y)$,
$q_2 = q_2(x, y)$, or conversely we can write x and y as functions of
$q_1$ and $q_2$: $x = x(q_1, q_2)$, $y = y(q_1, q_2)$. We must remember
carefully what these quantities are functions of, in taking partial 
derivatives. Now we have 

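$$\frac{dx}{dt} = \frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt},$$

$$\begin{aligned}
\frac{d^2x}{dt^2} &= \frac{\partial x}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial x}{\partial q_2}\frac{d^2q_2}{dt^2} \\
&\quad + \frac{dq_1}{dt}\left(\frac{\partial^2 x}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right) \\
&\quad + \frac{dq_2}{dt}\left(\frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_2^2}\frac{dq_2}{dt}\right),
\end{aligned}$$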

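with a similar equation for $d^2y/dt^2$. In terms of these, we set up
the equations $m\,d^2x/dt^2 = -\partial V/\partial x$, etc. Then we multiply the
first by $\partial x/\partial q_1$, the second by $\partial y/\partial q_1$, and add. We have

$$\begin{aligned}
m\Bigg\{ &\left[\left(\frac{\partial x}{\partial q_1}\right)^2 + \left(\frac{\partial y}{\partial q_1}\right)^2\right]\frac{d^2q_1}{dt^2} + \left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2} + \frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\frac{d^2q_2}{dt^2} \\
&+ \left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_1^2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_1^2}\right)\left(\frac{dq_1}{dt}\right)^2 \\
&+ 2\left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_1\,\partial q_2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_1\,\partial q_2}\right)\frac{dq_1}{dt}\frac{dq_2}{dt} \\
&+ \left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_2^2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_2^2}\right)\left(\frac{dq_2}{dt}\right)^2 \Bigg\} \\
&= -\left(\frac{\partial V}{\partial x}\frac{\partial x}{\partial q_1} + \frac{\partial V}{\partial y}\frac{\partial y}{\partial q_1}\right) = -\frac{\partial V}{\partial q_1}.
\end{aligned}$$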

It will next be shown that the rather complicated expression 
on the left is equal to 

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_1}\right) - \frac{\partial T}{\partial q_1},$$

where T is the kinetic energy. To do this, we first have 


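$$T = \frac{m}{2}\left[\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)^2 + \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)^2\right].$$

Then by differentiation, remembering that $\dot q_1 = dq_1/dt$,

$$\frac{\partial T}{\partial \dot q_1} = m\left[\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)\frac{\partial x}{\partial q_1} + \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)\frac{\partial y}{\partial q_1}\right],$$

$$\begin{aligned}
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_1}\right) = m\Bigg\{ &\left(\frac{\partial x}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial x}{\partial q_2}\frac{d^2q_2}{dt^2}\right)\frac{\partial x}{\partial q_1} + \left(\frac{\partial y}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial y}{\partial q_2}\frac{d^2q_2}{dt^2}\right)\frac{\partial y}{\partial q_1} \\
&+ \left(\frac{dq_1}{dt}\right)^2\left[\frac{\partial}{\partial q_1}\left(\frac{\partial x}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_1}\left(\frac{\partial y}{\partial q_1}\right)^2\right] \\
&+ \frac{dq_1}{dt}\frac{dq_2}{dt}\left[\frac{\partial}{\partial q_2}\left(\frac{\partial x}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_2}\left(\frac{\partial y}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_1}\left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2}\right) + \frac{\partial}{\partial q_1}\left(\frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\right] \\
&+ \left(\frac{dq_2}{dt}\right)^2\left[\frac{\partial}{\partial q_2}\left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2}\right) + \frac{\partial}{\partial q_2}\left(\frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\right]\Bigg\},
\end{aligned}$$

$$\begin{aligned}
\frac{\partial T}{\partial q_1} = m\Bigg[ &\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)\left(\frac{\partial^2 x}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right) \\
&+ \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)\left(\frac{\partial^2 y}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 y}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right)\Bigg].
\end{aligned}$$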


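Combining these two expressions, it is easy to see that we have
just the quantity which we desired. We have then the equation

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_1}\right) - \frac{\partial T}{\partial q_1} = -\frac{\partial V}{\partial q_1}.$$

If we set L = T - V, and remember that, since V does not
depend on the velocities, $\partial V/\partial \dot q_1 = 0$, this becomes

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_1}\right) - \frac{\partial L}{\partial q_1} = 0,$$

or Lagrange's equation for $q_1$. Similarly we can prove the equation
for $q_2$.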

It is worth remarking that the method which we have used 
for proving Lagrange's equations, though straightforward and 
simple in principle, is not the one usually employed. More often 
a derivation is given using the calculus of variations, which avoids 
most of the algebraic complications, but which on the other hand 
is more difficult in the fundamental ideas involved. 



1. A particle of mass $m$ is attracted to a center by a force $-Gmm'/r^2$. Find
perihelion and aphelion distances as a function of energy and angular
momentum. Assuming that the orbit is an ellipse, prove that its major
axis is $-Gmm'/E$.

2. In Prob. 1, show that it is possible for perihelion and aphelion distances 
to be equal, so that the orbit is circular. Find the necessary relation 
between energy and angular momentum for this to happen, and check this 
relation by elementary discussion, balancing the centrifugal force in the 
circular motion against the attraction. 

3. A particle in an inverse square field executes an elliptical motion with 
the center of attraction as a focus. Find the period of this motion, by 
considering the radial motion, proceeding as in Prob. 5, Chap. V, using the 
results of that problem if you wish, but finding the period in terms of energy 
and angular momentum. 

4. Discuss in detail the motion of a planet about a sun, proving that, if 
the energy is negative, the orbit is elliptical with the sun at a focus, and 
finding the relations between the major and minor axes of the ellipse and the 
energy and angular momentum. A procedure for the discussion is sug- 
gested as follows: 

Assuming the angular momentum to be $p = mr^2\dot\theta$ = constant, show
that the energy is

\[
\frac{p^2}{2m}\left[\left(\frac{du}{d\theta}\right)^2 + u^2\right] - Gmm'u, \qquad \text{where } u = \frac{1}{r}.
\]

Find $du/d\theta$ from the equation of an ellipse in polar coordinates,
with one focus as a pole, which is

\[
u = \frac{1 - e\cos\theta}{a(1 - e^2)},
\]

where $a$ is the semi-major axis, $e$ the eccentricity,
so that $b$, the semi-minor axis, is given by $b^2/a^2 = 1 - e^2$. Substituting
your value of $du/d\theta$ into the expression for energy, show that the result is a
constant, independent of $\theta$, and equal to $E$, if the major axis and eccentricity
are properly chosen.

5. Suppose a particle of mass $m$, charge $e$, collides with a very heavy
particle which has charge $e'$, so that it repels with a potential energy $ee'/r$.
The first particle is moving with a velocity $v_0$ at a great distance, and is
aimed so that, if it continued in a straight line, it would pass by the center
of repulsion at a minimum distance $R$. Note that this determines the
angular momentum. Using the energy method, find the perihelion distance
as a function of $R$ and the velocity of the particle.

6. Discuss in detail the motion of the particle of Prob. 5, showing that it
will be deflected so that after the collision the line of travel will make an
angle $\phi$ with the initial direction, where

\[
\tan\frac{\phi}{2} = \frac{ee'}{mv_0^2R}.
\]

Such deflections are observed in collisions between alpha particles and
atomic nuclei, in Rutherford's scattering experiments.

Suggestions: the particle executes a hyperbolic orbit, and the desired
angle is the angle between the asymptotes. Now the equation of a hyperbola
in polar coordinates is just like that of an ellipse, as given in Prob. 4,
except that the eccentricity is greater than 1, so that the term $1 - e\cos\theta$
can become zero, and $r$ infinite, giving the angles of the asymptotes in
terms of $e$. We need then only determine $e$ in terms of energy and angular
momentum, from the equations found in Prob. 4.

7. A two-dimensional linear oscillator is attracted to a center by a force
proportional to the distance, or $F_x = -ax$, $F_y = -ay$. Solve in rectangular
coordinates, separating variables, showing that $x$ and $y$ execute independent
simple harmonic vibrations of the same frequency. Prove that the
resulting orbit is an ellipse, with its center at the center of attraction.

8. Taking the solution of Prob. 7 in rectangular coordinates, find the 
angular momentum vector by ordinary vector formulas from the displace- 
ment and velocity, and prove by direct computation that it remains con- 
stant. Find the angular momentum as a function of the dimensions of the 
elliptical orbit, and show its connection with the area of the orbit. 

9. Set up the problem of the two-dimensional linear oscillator, as in Prob. 
7, using polar coordinates. Separate variables, solve the radial problem 
by the energy method, compute the period in this way, and show that it is in 
agreement with the period as found in Prob. 7. 




In the last chapter we have found the equations of motion 
in generalized coordinates, but we have not considered the mean- 
ing in these coordinates of the simple concepts of momentum and 
force. We shall accordingly examine these questions, and shall 
see that the equations can be interpreted in the form that the 
force equals the time rate of change of momentum, which as we 
know is a more fundamental statement than that it is the mass 
times acceleration. Using the momentum, we can then restate 
the equations in a form called Hamilton's equations, equivalent 
to Lagrange's equations, but more powerful in some applications 
to advanced mechanics. 

46. Generalized Forces. — In many mechanical problems we 
have to deal with forces which cannot be derived from a potential. 
Let us see how such forces may be included in the Lagrangian 
scheme. For simplicity we take a two-dimensional problem, 
and let the $x$ and $y$ components of force be $F_x$ and $F_y$, which may
depend on time, velocity, etc., as well as position. For generality,
we assume that part of the force can be derived from a
potential, the rest not, so that we have $F_x = -(\partial V/\partial x) + F_x'$,
etc., where $F_x'$ is the part of the force not derivable from a
potential. Now if we proceed with the proof of Lagrange's
equations as in the last chapter, we easily find

\[
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_1}\right) - \frac{\partial T}{\partial q_1}
 = -\frac{\partial V}{\partial q_1} + F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1},
\]

with a similar equation for $q_2$. We may introduce as before a
Lagrangian function, containing the part of the external forces
derivable from a potential: $L = T - V$. Then

\[
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_1}\right) - \frac{\partial L}{\partial q_1}
 = F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1} = Q_1, \tag{1}
\]

with a similar equation for $q_2$, where $Q_1$, $Q_2$ are called the
generalized forces connected with the coordinates $q_1$, $q_2$. The
equation in this form may be used to discuss any arbitrary problem,
for example of damped motion, in generalized coordinates.

It is worth noting that these generalized forces are closely 
related to the work done in an arbitrary displacement, just as 
ordinary forces are in rectangular coordinates. For imagine the 
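generalized coordinates changed by amounts $dq_1$, $dq_2$. There
will be a certain amount of work done on the system, equal to
$-dV + dW$, where $dW$ is the work done by the external
nonconservative force $F'$ (a force is spoken of as conservative if it is
derivable from a potential, nonconservative otherwise). Now
in general we have

\[
\begin{aligned}
dW &= F_x'\,dx + F_y'\,dy \\
 &= \left(F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1}\right)dq_1
 + \left(F_x'\frac{\partial x}{\partial q_2} + F_y'\frac{\partial y}{\partial q_2}\right)dq_2 \\
 &= Q_1\,dq_1 + Q_2\,dq_2,
\end{aligned} \tag{2}
\]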

or the sum of products of generalized forces by generalized dis- 
placements. It is, of course, plain that all these arguments work 
equally well with more than two generalized coordinates. 

The forces Q which we have just introduced were the external 
applied forces not derivable from a potential. But we may well 
consider all the forces together. We could write Lagrange's 
equations as 

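\[
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_i}\right) = Q_i - \frac{\partial V}{\partial q_i} + \frac{\partial T}{\partial q_i}. \tag{3}
\]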

The three terms on the right of Eq. (3) may be taken to be three 
terms of the force. The first is the generalized force not derivable 
from a potential, the second the force derivable from a potential, 
the third the fictitious force, like a centrifugal force, arising from 
the fact that the coordinate system is not rectangular. Equation 
(3) states that this total force equals the time rate of change of a 
certain quantity, and it seems reasonable to consider this quantity 
as a generalized momentum. 

47. Generalized Momenta. — In simple cases the quantity 
$\partial L/\partial \dot q_i$ plays the part of a momentum. Thus in rectangular
coordinates, we have $\partial L/\partial \dot x = m\dot x$, or exactly the momentum
associated with the coordinate $x$. Similarly in polar coordinates
the quantities associated with $r$ and $\theta$ are $m\dot r$, the radial
momentum, and $mr^2\dot\theta$, the angular momentum, respectively. These
are but examples of a general rule, and as a matter of fact we
define $\partial L/\partial \dot q_i$ to be the generalized momentum associated with
the coordinate $q_i$, denoting it by $p_i$. We note that generalized
momenta are not of the same dimensions as ordinary momenta,
in general; they are not simply components of the momentum
referred to other coordinates. Similarly generalized forces are
not simply components of forces. For instance, it is easily shown
that in polar coordinates the generalized force $Q_r$ is the
component of force along $r$, but $Q_\theta$ is the moment of force, or torque,
which by Eq. (3) above equals the time rate of change of the
angular momentum.
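
The remark that the radial generalized force is the force component along r, while the angular one is the torque, can be checked numerically. The sketch below is an added illustration, not part of the original text; the function name `generalized_forces_polar` and the sample numbers are assumptions chosen for the example. It forms each generalized force as F_x times dx/dq plus F_y times dy/dq, with x = r cos(theta), y = r sin(theta), and compares the results with the elementary expressions.

```python
import math


def generalized_forces_polar(r, theta, fx, fy):
    """Return (Q_r, Q_theta) for a Cartesian force (fx, fy) at polar position (r, theta)."""
    # Partial derivatives of x = r cos(theta), y = r sin(theta).
    dx_dr, dy_dr = math.cos(theta), math.sin(theta)
    dx_dtheta, dy_dtheta = -r * math.sin(theta), r * math.cos(theta)
    # Q_j = F_x dx/dq_j + F_y dy/dq_j
    q_r = fx * dx_dr + fy * dy_dr
    q_theta = fx * dx_dtheta + fy * dy_dtheta
    return q_r, q_theta


# Arbitrary sample point and force (illustrative values only).
r, theta = 2.0, 0.7
fx, fy = 1.5, -0.4
q_r, q_theta = generalized_forces_polar(r, theta, fx, fy)

x, y = r * math.cos(theta), r * math.sin(theta)
radial_component = (x * fx + y * fy) / r   # force component along r
torque = x * fy - y * fx                   # moment of the force about the origin

print(q_r, radial_component)
print(q_theta, torque)
```

The two printed pairs should agree to rounding error, which is the content of the statement in the text for this particular coordinate system.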

48. Hamilton's Equations of Motion. — Assuming no external 
forces $Q$, we could evidently write Lagrange's equations in the
form $dp_i/dt - \partial L/\partial q_i = 0$, or $dp_i/dt = \partial L/\partial q_i$, which, taken
together with the definitions $p_i = \partial L/\partial \dot q_i$, would form a complete
system. But there is a neater method, known as Hamilton's
method, which we use instead. We can first see how Hamilton's
equations are set up in rectangular coordinates. There we have
$T = (m/2)(\dot x^2 + \dot y^2 + \dot z^2)$. Then it is true that we have, for
example,

\[
p_x = \frac{\partial L}{\partial \dot x} = \frac{\partial T}{\partial \dot x} = m\dot x.
\]

We can also write $T$, not in terms of the velocities $\dot x$, $\dot y$, $\dot z$, but
in terms of the momenta $p_x$, $p_y$, $p_z$. Since $\dot x = p_x/m$, we have

\[
T(p_x, p_y, p_z) = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right),
\]

where we must specify that $T$ is a function of the $p$'s. Then we
have

\[
\frac{\partial T(p_x, p_y, p_z)}{\partial p_x} = \frac{p_x}{m} = \frac{m\dot x}{m} = \dot x,
\]

and similarly $\partial T(p)/\partial p_y = \dot y$, $\partial T(p)/\partial p_z = \dot z$. These take the
place of the equations $p_x = \partial T(\dot x, \dot y, \dot z)/\partial \dot x$, etc.

Now in Hamilton's method we set up what is called the Hamil- 
tonian function H. This is in all ordinary cases simply the total 
energy T + V, in which T is expressed in terms of the momenta, 
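rather than the velocities. Thus we have $H = H(q_i, p_i)$, meaning
that it is a function of the coordinates and momenta. Then

\[
\frac{\partial H}{\partial q_i} = \frac{\partial T}{\partial q_i} + \frac{\partial V}{\partial q_i},
\]

which in rectangular coordinates, where $T$ does not depend on the
coordinates, gives $\partial H/\partial q_i = \partial V/\partial q_i = -\partial L/\partial q_i$, so
that in this case Lagrange's equation becomes $dp_i/dt = -\partial H/\partial q_i$.
Also

\[
\frac{\partial H}{\partial p_i} = \frac{\partial T}{\partial p_i} = \dot q_i = \frac{dq_i}{dt}.
\]

The resulting equations are called Hamilton's equations:

\[
\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad
\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \tag{4}
\]

It is evident that they show a symmetry between $p_i$ and $q_i$, which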
is one reason for preferring them over Lagrange's equations. 
For a given problem, there are twice as many Hamiltonian 
equations as Lagrangian equations, but they are only first-order 
rather than second-order differential equations, so that it comes 
down essentially to the same thing. 
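
As a minimal worked illustration (an added example, not taken from the text), consider a single particle moving along x in a potential V(x). The pair of first-order Hamilton equations then reproduces the single second-order Newtonian equation, which is the sense in which the two formulations come to the same thing:

```latex
\[
H(x, p) = \frac{p^2}{2m} + V(x), \qquad
\frac{dx}{dt} = \frac{\partial H}{\partial p} = \frac{p}{m}, \qquad
\frac{dp}{dt} = -\frac{\partial H}{\partial x} = -\frac{dV}{dx}.
\]
% Eliminating p between the two first-order equations:
\[
m\,\frac{d^2x}{dt^2} = \frac{dp}{dt} = -\frac{dV}{dx}.
\]
```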

49. General Proof of Hamilton's Equations. — Our proof holds 
only in rectangular coordinates, and we must next give a general 
proof. As before, we start with Lagrange's equations, which we 
assume are correct, and we define the momenta as derivatives of 
the Lagrangian function with respect to the velocities. Then we 
set up the Hamiltonian function in terms of the Lagrangian 
function, by the equation 

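\[
H = \sum_j p_j \dot q_j - L. \tag{5}
\]

This seems at first quite different from our elementary definition
of $H$ as the energy, but we shall show in the next paragraph that
it is equivalent. We express the Hamiltonian in terms of
coordinates and momenta, writing the velocities $\dot q_j$, where they appear
both in $\sum_j p_j \dot q_j$ and $L$, in terms of the momenta, so that we have

\[
H = \sum_j p_j\,\dot q_j(p_k, q_k) - L[\dot q_j(p_k, q_k),\, q_j].
\]

Then we have

\[
\frac{\partial H}{\partial p_i} = \dot q_i + \sum_j p_j \frac{\partial \dot q_j}{\partial p_i}
 - \sum_j \frac{\partial L}{\partial \dot q_j}\frac{\partial \dot q_j}{\partial p_i}.
\]

But since by definition $p_j = \partial L/\partial \dot q_j$, the last two terms cancel,
leaving

\[
\frac{\partial H}{\partial p_i} = \dot q_i = \frac{dq_i}{dt}.
\]

Similarly

\[
\frac{\partial H}{\partial q_i} = \sum_j p_j \frac{\partial \dot q_j}{\partial q_i}
 - \sum_j \frac{\partial L}{\partial \dot q_j}\frac{\partial \dot q_j}{\partial q_i} - \frac{\partial L}{\partial q_i}.
\]

This time the first two terms cancel, leaving $\partial H/\partial q_i = -\partial L/\partial q_i$,
so that by Lagrange's equations

\[
\frac{\partial H}{\partial q_i} = -\frac{dp_i}{dt}.
\]

Thus we have proved both of Hamilton's equations in the general
case.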
It remains to be shown that the Hamiltonian function, as we
have defined it, is the same as the total energy. First we consider
the kinetic energy expressed in terms of the velocities. This is a
homogeneous quadratic function of the velocities:

\[
T = \sum_j \sum_k A_{jk}\,\dot q_j \dot q_k, \tag{6}
\]

where the $A$'s are coefficients depending in general on the
coordinates, and we are to sum over all possible values $j$ and $k$. In
particular, for rectangular coordinates, $A_{jk} = m/2$ if $j = k$, $0$ if
$j \ne k$. In cases where the coordinates are orthogonal, that is,
the coordinate surfaces intersect at right angles, as they do, for
instance, in spherical polar coordinates, or in fact in all the
coordinate systems in common use, only square terms come in,
all coefficients $A_{jk}$ being zero if $j \ne k$. But in oblique coordinate
systems, this is not true. Now for such a homogeneous
quadratic expression we have the theorem

\[
\sum_i \dot q_i \frac{\partial T}{\partial \dot q_i} = 2T,
\]

which we can immediately prove. For
$\partial T/\partial \dot q_i = \sum_j (A_{ij} + A_{ji})\,\dot q_j$, so that

\[
\sum_i \dot q_i \frac{\partial T}{\partial \dot q_i} = \sum_i \sum_j (A_{ij} + A_{ji})\,\dot q_i \dot q_j.
\]

The double sum is now just twice the sum of Eq. (6) which we
previously gave as $T$, proving the theorem. Hence, using
$\partial T/\partial \dot q_i = \partial L/\partial \dot q_i = p_i$, we have $T = \tfrac{1}{2}\sum_i p_i \dot q_i$, so that our
definition in Eq. (5) of $H$ gives $H = 2T - L = 2T - T + V =
T + V$ = total energy, as we wished to prove.
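
The theorem on homogeneous quadratic functions is easy to test numerically. The following is a small sketch added for illustration, with an arbitrary, deliberately non-symmetric coefficient array `a` and arbitrary velocities `qdot` (both assumptions, not values from the text). It evaluates T as the double sum over A_jk times the two velocities, uses the derivative formula with (A_ij + A_ji), and compares the weighted sum of derivatives with 2T.

```python
def kinetic_energy(a, qdot):
    """T = sum_j sum_k A_jk * qdot_j * qdot_k."""
    n = len(qdot)
    return sum(a[j][k] * qdot[j] * qdot[k] for j in range(n) for k in range(n))


def dT_dqdot(a, qdot, i):
    """dT/d(qdot_i) = sum_j (A_ij + A_ji) * qdot_j."""
    n = len(qdot)
    return sum((a[i][j] + a[j][i]) * qdot[j] for j in range(n))


# Illustrative coefficients (as for an oblique coordinate system) and velocities.
a = [[1.0, 0.3],
     [0.5, 2.0]]
qdot = [0.7, -1.2]

T = kinetic_energy(a, qdot)
euler_sum = sum(qdot[i] * dT_dqdot(a, qdot, i) for i in range(len(qdot)))

print(T, euler_sum)  # euler_sum should equal 2 * T up to rounding
```

Because only the combination A_jk + A_kj enters, the check works whether or not the coefficient array is written symmetrically.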

In advanced work, one sometimes meets cases where H is not 
equivalent to the total energy. Such cases are found, for 
instance, where magnetic forces are present. But even here, 
the following general rules are correct : 

First, set up a Lagrangian function, so that the equations of
motion can be written in Lagrangian form. This can sometimes,
as in the magnetic case, be done, even if we cannot interpret
the Lagrangian function as $T - V$; for in the magnetic case, the
forces are not derivable from a potential, depending rather on the
velocity, and yet vary in such a way that we can use a Lagrangian
function.

Next, define the momenta as $p_i = \partial L/\partial \dot q_i$.

Set up the Hamiltonian function $\sum_i p_i \dot q_i - L$, expressing it in
terms of coordinates and momenta.

Then Hamilton's equations hold, using this Hamiltonian.

50. Example of Hamilton's Equations. — Let us by way of 
illustration work out Hamilton's equations for the problem of 
planetary motion, discussed in the previous chapter by Lagrange's 
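method. In terms of the coordinates $r$ and $\theta$, we found that

\[
L = \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) - V(r).
\]

Then the momenta are $p_r = \partial L/\partial \dot r = m\dot r$, the ordinary
momentum along the radius, and $p_\theta = \partial L/\partial \dot\theta = mr^2\dot\theta$, the angular
momentum. Next we have

\[
\begin{aligned}
\sum_i p_i \dot q_i - L &= (m\dot r)\dot r + (mr^2\dot\theta)\dot\theta - L \\
 &= m\left(\dot r^2 + r^2\dot\theta^2\right) - \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) + V(r) \\
 &= \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) + V(r) \\
 &= \text{total energy}.
\end{aligned}
\]

Solving the equations for $\dot r$ and $\dot\theta$ in terms of $p_r$ and $p_\theta$, we have
$\dot r = p_r/m$, $\dot\theta = p_\theta/mr^2$, and substituting these in the Hamiltonian,
we have

\[
H = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + V(r).
\]

Then Hamilton's equations are

\[
\frac{\partial H}{\partial p_r} = \frac{p_r}{m} = \frac{dr}{dt}, \qquad
\frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} = \frac{d\theta}{dt},
\]

both of which we already knew. Also

\[
-\frac{\partial H}{\partial r} = \frac{p_\theta^2}{mr^3} - \frac{dV(r)}{dr} = \frac{dp_r}{dt},
\]

showing that the time rate of change of radial momentum equals
the external force $-dV/dr$ in the $r$ direction, plus the centrifugal
force $p_\theta^2/mr^3$ (which evidently equals $mr\dot\theta^2 = m\omega^2 r = mv^2/r$).
Finally

\[
-\frac{\partial H}{\partial \theta} = 0 = \frac{dp_\theta}{dt},
\]

showing that the time rate of change of angular momentum is
zero, on account of the absence of torques.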
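
The four first-order equations of this example can be integrated directly. The sketch below is an added illustration under stated assumptions: an attractive inverse-first-power potential V(r) = -K/r, units with M = K = 1, a classical fourth-order Runge-Kutta step, and arbitrary starting values; none of these choices comes from the text. It advances (r, theta, p_r, p_theta) and reports how well the Hamiltonian and the angular momentum are conserved.

```python
M = 1.0   # particle mass (assumed value)
K = 1.0   # strength of the assumed potential V(r) = -K / r


def hamiltonian(state):
    """H = (p_r^2 + p_theta^2 / r^2) / (2 m) + V(r)."""
    r, _theta, p_r, p_theta = state
    return (p_r**2 + p_theta**2 / r**2) / (2.0 * M) - K / r


def derivatives(state):
    """Right-hand sides of Hamilton's equations for (r, theta, p_r, p_theta)."""
    r, _theta, p_r, p_theta = state
    dr = p_r / M                                # dH/dp_r
    dtheta = p_theta / (M * r**2)               # dH/dp_theta
    dp_r = p_theta**2 / (M * r**3) - K / r**2   # -dH/dr: centrifugal term minus dV/dr
    dp_theta = 0.0                              # -dH/dtheta: no torque in a central field
    return (dr, dtheta, dp_r, dp_theta)


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the four coupled equations."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2))
    k3 = derivatives(shifted(state, k2, dt / 2))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


# Start at perihelion of a bound orbit (illustrative initial values).
state = (1.0, 0.0, 0.0, 1.2)
e_start = hamiltonian(state)
for _ in range(2000):
    state = rk4_step(state, 0.01)
energy_drift = abs(hamiltonian(state) - e_start)

print("energy drift:", energy_drift)
print("angular momentum:", state[3])
```

The angular momentum stays fixed because its rate of change is identically zero, while the energy is conserved only to the accuracy of the integrator.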

51. Applications of Lagrange's and Hamilton's Equations. — 
From our discussion one might get the impression that the only 
use of Lagrange's and Hamilton's equations was in introducing 
curvilinear coordinates in problems of the dynamics of a particle. 
This is, however, far from the case. For example, one may have 
a particle moving subject to certain constraints, as a bead sliding 
along a frictionless wire, or a particle constrained to move on the 
surface of a sphere or other surface, as the bob of a spherical 
pendulum must move in a sphere. Then we may often satisfy 
the conditions of constraint by suitable choice of the generalized 
coordinates. Thus, with the spherical pendulum, we may take
spherical polar coordinates $r$, $\theta$, $\phi$. We may then arbitrarily
set $r$ constant, equal to $R$, the radius of the sphere, and write
Lagrange's equations for $\theta$ and $\phi$. To justify this, we note that
the component of the external and centrifugal force normal to 
the sphere will be exactly balanced by the reaction of the con- 
straint, just as the weight of a body resting on a table is exactly 
balanced by the upward push of the table. Thus the generalized 
force acting in the direction of r will be zero, so that a constant 
value for R, leading to a constant and vanishing generalized 
momentum along r, is a solution of the equations. For a 
particle on a wire, similarly, if the wire happened to be a circle, 
we could take polar coordinates, set r constant, and have but 
one equation of motion, stating that the torque acting on the 
particle equaled the time rate of change of its angular momen- 
tum. We note that these two problems are essentially equiva- 
lent to the spherical and ordinary pendulum, which are rigid 
bodies, suggesting that Lagrange's equations are of use in dis- 
cussing the motion of a rigid body. But we can go even further. 
An Atwood's machine, for instance, is a special case of coupled 
systems, two weights being hung by a string over a pulley. This 
can be described very easily by a single generalized coordinate. 
In the general problem of coupled systems, and in fact in all 
problems of interaction of different particles or systems, 
Lagrange's method is very suitable, as we shall see. In fact, 
there is hardly a mechanical problem where generalized coordi- 
nates are not applied. 
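
As a brief worked illustration of the single-coordinate description (an added sketch, not from the text), take a simple Atwood's machine with masses m_1 and m_2 on a string of length l over a massless, frictionless pulley, and let q be the distance of m_1 below the pulley, so that m_2 hangs at l - q. These symbols are assumptions introduced for the example.

```latex
\[
T = \tfrac{1}{2}(m_1 + m_2)\,\dot q^2, \qquad
V = -m_1 g\,q - m_2 g\,(l - q),
\]
\[
L = T - V = \tfrac{1}{2}(m_1 + m_2)\,\dot q^2 + (m_1 - m_2)\,g\,q + m_2 g\,l .
\]
% Lagrange's equation d/dt(dL/d\dot q) - dL/dq = 0 then gives
\[
(m_1 + m_2)\,\ddot q = (m_1 - m_2)\,g ,
\qquad\text{so}\qquad
\ddot q = \frac{m_1 - m_2}{m_1 + m_2}\,g .
\]
```

The tension in the string never appears, since the constraint has been absorbed into the choice of coordinate.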

For the actual solution of problems, Hamilton's equations are 
generally not so convenient as Lagrange's equations. Their 
importance comes in the insight they give into the situation, 
by bringing the momenta directly into the statement of the 
equations, and for their relation to more advanced mechanics. 
The applications are principally to three fields: celestial 
mechanics, statistical mechanics, and quantum theory. We 
shall indicate in the next chapter the nature of some of these 
applications of Hamiltonian methods, taking up some of the 
general properties of the motion of particles, but postponing 
until later in the book the discussion of statistics and of quantum
theory.


1. An Atwood's machine is built as follows: A string of length $l_1$ passes
over a light fixed pulley, supporting a mass $m_1$ on one end and a pulley of
mass $m_2$ (negligible moment of inertia) on the other. Over this second
pulley passes a string of length $l_2$ supporting a mass $m_3$ on one end and $m_4$ on
the other, where $m_3 \ne m_4$. Set up Lagrange's equations of motion for this
system, using two appropriate generalized coordinates. From these show
that the mass $m_1$ remains in equilibrium if

\[
m_1 = m_2 + m_3 + m_4 - \frac{(m_4 - m_3)^2}{m_3 + m_4}.
\]

2. A particle slides on the inside of a smooth paraboloid of revolution 
whose axis is vertical. Use the distance from the axis, r, and the azimuth 
as generalized coordinates. Find the equations of motion. Find the angu- 
lar momentum necessary for the particle to move in a horizontal circle. 
If this latter motion is disturbed slightly, show that the particle will perform 
small oscillations about this circular path, and find the period of these
oscillations.
3. Set up the kinetic energy, Lagrange's equations, and Hamilton's 
equations in spherical polar coordinates. Set up expressions for the generalized
forces acting on $r$, $\theta$, and $\phi$, and for the generalized momenta, explaining
the physical meaning of these quantities. 

4. Set up the problem of a spherical pendulum subject to gravity and 
to a resisting force proportional to the velocity and opposite in direction. 
Use spherical polar coordinates. Show that for small amplitudes and no 
damping this problem reduces to the two-dimensional linear oscillator of 
Prob. 9, Chap. VII. 

5. Derive the Hamiltonian equations for Prob. 4, in the general case,
showing that the damping forces give extra terms in the equations propor- 
tional to the momenta. Show that these equations in general cannot be 
separated. Derive a solution, however, for the special case in which the 
instantaneous motion would be a rotation about the lowest point of the 
sphere if damping were absent. Assume small damping, so that the actual 
motion is a gradual spiralling in toward the lowest point. 

6. The force on an electron of charge $e$, moving with a velocity $\mathbf{v}$ in a
magnetic field $\mathbf{H}$, is given by $\mathbf{F} = (e/c)(\mathbf{v} \times \mathbf{H})$, where $c$ is the velocity of light.
This corresponds to the ordinary motor law, in which the force on a circuit
is proportional to the current (here $ev/c$) and to the field, and at right angles
to both. In addition, the magnetic field $\mathbf{H}$ can be given as the curl of a
vector $\mathbf{A}$, called the vector potential. Show that the equations of motion of
an electron moving in such a magnetic field, and in addition in a potential
field of potential $V$, can be described by Lagrange's equations, with the
Lagrangian function $L = T - V + (e/c)(\mathbf{v} \cdot \mathbf{A})$. Assume the vector
potential, and magnetic field, to be independent of time, but note that

\[
\frac{d\mathbf{A}}{dt} = \frac{\partial \mathbf{A}}{\partial t} + \frac{\partial \mathbf{A}}{\partial x}\frac{dx}{dt}
 + \frac{\partial \mathbf{A}}{\partial y}\frac{dy}{dt} + \frac{\partial \mathbf{A}}{\partial z}\frac{dz}{dt},
 \qquad \text{where } \frac{\partial \mathbf{A}}{\partial t} = 0.
\]

7. For the particle of Prob. 6, set up the momentum and the Hamiltonian 
function. Show that the momenta do not equal mass times velocity, and 
the Hamiltonian does not have the form $p^2/2m + V$.

8. In the relativity theory, the equations of motion of a particle are
different from what they are in classical mechanics, though they reduce to
the same thing for small velocities. In particular, the mass of a particle
increases with velocity. If a particle has a mass $m_0$ when at rest, its mass
at speed $v$ is given by

\[
m = \frac{m_0}{\sqrt{1 - v^2/c^2}},
\]

where $c$ is the velocity of light; reducing to $m_0$ in the limit $v/c = 0$, but
becoming infinite when the particle moves with the speed of light.

Show that the equations of motion are correctly given from the Lagrangian
function $-m_0c^2\sqrt{1 - v^2/c^2} - V$, when we remember that the momentum
equals the velocity times the (variable) mass.

Derive the Hamiltonian function from the Lagrangian function. Setting
the Hamiltonian function equal to $T + V$, where $T$ is the kinetic energy,
show that the Lagrangian function is not equal to $T - V$, as is natural
from the fact that the kinetic energy is not a homogeneous quadratic function
of the velocities. Taking the kinetic energy, expand in power series
in the quantity $v/c$, showing that for low speeds the kinetic energy
approaches its ordinary classical value, except for an additive constant $m_0c^2$.
This additive constant, which always appears in relativistic energy expressions,
is interpreted as meaning that the mass of the particle is really equivalent
to energy, 1 gm. being convertible into $c^2$ ergs of energy.



As in one-dimensional motion, we can make a great deal of 
use of the energy in discussing motion in two and three dimen- 
sions. In a conservative system with potential energy V, the 
motion can occur only in those regions of space where E — V 
is positive, if E is the total energy, and we can thus divide up 
our possible problems into those occurring within a finite region 
and those going to infinity. As with one-dimensional motion, 
there are sometimes periodicity properties associated with the 
finite motions, which we discuss in the present chapter. With 
two-dimensional motion, we can visualize the use of the energy 
very easily, plotting $V$ as a height in a three-dimensional graph,
the result looking like a relief map, or else drawing equipotentials, 
which represent the potential as the contour lines represent 
height on a map. For a total energy E, we imagine the map 
filled with water up to a level E, so that the submerged parts, 
lakes and oceans, represent the regions where the motion occurs. 
We may also use the analogy of the rolling ball in two dimensions 
as well as in one, imagining that a ball starts rolling down the side 
of a hill in our relief map, climbing up the valley on the other 
side, and oscillating back and forth. From physical intuition 
as to the motion of such a ball, we can derive much information 
about complicated forms of motion. 

There is one great complication present in motion in several 
dimensions which was absent in one-dimensional motion. In 
that simpler case, the velocity of a particle was determined at 
each point of space in a conservative motion, only the direction, 
forward or back, being arbitrary. Here, however, while the 
magnitude of the velocity is still determined, there are an infinite 
number of possible directions associated with the same magnitude.
To describe a motion completely, then, even if we know
its energy, we must give as well the velocity components, or
else the momenta, at each point of the path. This is accomplished
by describing the motion, not in ordinary space, but in
the so-called phase space, in which there are dimensions asso- 
ciated with both the generalized coordinates and the generalized 
momenta. And the importance of Hamilton's equations arises 
from the fact that they are peculiarly suited to a discussion of 
motion by means of the phase space. 

52. The Phase Space. — For a system with n degrees of freedom 
and $n$ generalized coordinates $q_1 \ldots q_n$, the phase space is a
$2n$-dimensional space in which $q_1 \ldots q_n$ and $p_1 \ldots p_n$ are
plotted as variables. A single point in this space, often called a
representative point, then determines all coordinates and velocities
of the system. As time goes on, the representative point
moves about the space, as both coordinates and momenta change
with time. It is here that we make connection with Hamilton's
equations, for these equations, $dq_i/dt = \partial H/\partial p_i$, $dp_i/dt =
-\partial H/\partial q_i$, give just the components of the velocity of the
representative point in the phase space. The problem of dynamics
is to investigate the path of the representative point in the
phase space.

We can easily see some properties of this motion. In the first 
place, it takes place with constant energy, assuming that we are 
dealing only with conservative forces. To prove this, we have 
for the time rate of change of H 

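\[
\begin{aligned}
\frac{dH}{dt} &= \frac{\partial H}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial H}{\partial q_2}\frac{dq_2}{dt} + \cdots
 + \frac{\partial H}{\partial p_1}\frac{dp_1}{dt} + \frac{\partial H}{\partial p_2}\frac{dp_2}{dt} + \cdots \\
 &= \frac{\partial H}{\partial q_1}\frac{\partial H}{\partial p_1} + \frac{\partial H}{\partial q_2}\frac{\partial H}{\partial p_2} + \cdots
 + \frac{\partial H}{\partial p_1}\left(-\frac{\partial H}{\partial q_1}\right) + \frac{\partial H}{\partial p_2}\left(-\frac{\partial H}{\partial q_2}\right) + \cdots \\
 &= 0.
\end{aligned}
\]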

Now the energy $H$ is a function of the coordinates and momenta,
and hence a function of position in the phase space. Thus the
equation $H$ = constant determines a single relation between all
the $p$'s and $q$'s, and hence is the equation of a $(2n - 1)$-dimensional
hypersurface in the $2n$-dimensional space. The representative
point now moves about, but always stays on a single
energy surface. If in addition there are other quantities which
stay constant, as, for instance, an angular momentum, each one
of these quantities gives an additional equation between the $p$'s
and $q$'s, so that the representative point can move only in the
intersection of all the various surfaces represented by these
equations. Thus in some cases the region in which the motion
occurs is of smaller dimensionality than $2n - 1$. The extreme
case is purely periodic motion; in that case there are enough
quantities staying constant so that the motion of the representative
point is in a single closed line in phase space, or a
one-dimensional region. This fits in with the fact that all
one-dimensional conservative motions not extending to infinity are
periodic: for these have $n = 1$, $2n - 1 = 1$, so that the energy
"surface" itself reduces to a line. Motions are possible in all
the intermediate cases between the periodic motion, and the
other extreme in which the representative point comes eventually
arbitrarily close to every point of the energy surface. The latter
type of motion is called quasi-ergodic (ergodic motion being a
nonexistent type in which the point passes through every point
of the surface). Some of the intermediate types are multiply
periodic motions, like our doubly periodic motion in the central
field. We shall investigate some of these typical cases by means
of examples.

Fig. 11. — Phase space for a linear oscillator, with line of constant energy E. 

53. Phase Space for the Linear Oscillator. — As an illustration 
of one-dimensional motion, we may take a linear oscillator (see 
Fig. 11). The phase space is two-dimensional, so that the 
energy surface is really a line. If the energy is $\tfrac{1}{2}mv^2 + 2\pi^2 m\nu^2 x^2$,
where we readily see that $\nu$ is the frequency of oscillation, the
Hamiltonian function is $p^2/2m + 2\pi^2 m\nu^2 x^2$. Setting this equal
to a constant, $E$, the equation of the line of constant energy is
$p^2/2m + 2\pi^2 m\nu^2 x^2 = E$, or

\[
\frac{p^2}{\left(\sqrt{2mE}\right)^2} + \frac{x^2}{\left(\sqrt{E/2\pi^2 m\nu^2}\right)^2} = 1,
\]

the equation of an ellipse, having semi-axes $\sqrt{2mE}$ and
$\sqrt{E/2\pi^2 m\nu^2}$.
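
The elliptical energy line can be traced numerically by letting Hamilton's equations drive the representative point. The sketch below is an added illustration: the values of `M` and `NU`, the starting point, the step size, and the use of a fourth-order Runge-Kutta step are all assumptions made for the example. It integrates over one period and records how far the point strays from the ellipse whose semi-axes are the square roots of 2mE and of E/(2 pi^2 m nu^2).

```python
import math

M, NU = 1.0, 1.0                      # illustrative mass and frequency
K = 4 * math.pi**2 * M * NU**2        # so that V = K x^2 / 2 = 2 pi^2 m nu^2 x^2


def flow(x, p):
    """Velocity of the representative point: (dH/dp, -dH/dx)."""
    return p / M, -K * x


def rk4_step(x, p, dt):
    """One classical Runge-Kutta step in the (x, p) phase plane."""
    k1x, k1p = flow(x, p)
    k2x, k2p = flow(x + dt / 2 * k1x, p + dt / 2 * k1p)
    k3x, k3p = flow(x + dt / 2 * k2x, p + dt / 2 * k2p)
    k4x, k4p = flow(x + dt * k3x, p + dt * k3p)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            p + dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p))


def ellipse_lhs(x, p, energy):
    """Left side of the ellipse equation; equals 1 on the energy line."""
    semi_axis_p = math.sqrt(2 * M * energy)
    semi_axis_x = math.sqrt(energy / (2 * math.pi**2 * M * NU**2))
    return (p / semi_axis_p) ** 2 + (x / semi_axis_x) ** 2


x, p = 0.1, 0.0                        # start at maximum displacement
energy = p**2 / (2 * M) + 0.5 * K * x**2
worst = 0.0
for _ in range(1000):                  # 1000 steps of 1e-3 cover one period 1/NU
    x, p = rk4_step(x, p, 1e-3)
    worst = max(worst, abs(ellipse_lhs(x, p, energy) - 1.0))

print("largest departure from the ellipse:", worst)
print("state after one period:", x, p)
```

After one period the point should return close to its starting position, consistent with the closed curve in the phase plane.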



54. Phase Space for Central Motion. — As an illustration of a 
two-dimensional problem, we may take a central motion. The 
phase space is four-dimensional, so that we cannot directly plot 
it: the axes represent $r$, $\theta$, $p_r$, $p_\theta$. But we recall that in central
motion $p_\theta$ stays constant, so that we may choose a particular
value of $p_\theta$, and use a three-dimensional section of the phase
space, the axes representing $r$, $\theta$, and $p_r$. We imagine $r$ and $\theta$ as
rectangular coordinates in a plane, and $p_r$ as a coordinate at
right angles to the plane. Now the energy surface is given by

\[
\frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + V(r) = E = \text{constant},
\]

or solving for $p_r$, $p_r = \pm\sqrt{2m(E - V - p_\theta^2/2mr^2)}$. That is, for
each value of $r$ and $\theta$ ($E$ and $p_\theta$ being fixed), two values of $p_r$ are
given by the equation. If we plot these values, we get the
surface of Fig. 12, on which the representative point moves
in a spiral around the cylindrical surface, $\theta$ continually increasing,
while $r$ increases and decreases as the point spirals round and
round. Although the orbit criss-crosses on itself, as we saw in
Fig. 10, Chap. VII, still in the phase space the two different
possible directions of motion at a given point of space are on
opposite sides of the energy surface.

Fig. 12. Surface of constant energy and constant angular momentum in
phase space of a particle moving in a central field. The spiral represents the path
of a particle.

In Fig. 12, we have plotted only the part of the energy surface
between $\theta = 0$ and $\theta = 2\pi$. The spiral, however, continues
indefinitely. Since the regions from $\theta = 2\pi$ to $4\pi$, $4\pi$ to $6\pi$, etc.,
all represent the same regions of space as $0$ to $2\pi$, it is reasonable
to telescope these sections of the surface on to the one we have
drawn. Each one will have its own segment of spiral, so that 
we shall have an infinite number of pieces all drawn on the surface 
shown in Fig. 12. There are now two possibilities. First, the 
motion can be periodic, as mentioned in Chap. VII. Then 
infinitely many segments of the spiral will lie on top of each 
other, resulting in one or a finite number of segments only. This 
is then a one-dimensional line in phase space, as we expect for 
periodic motion. Or second, the motion can be doubly periodic, 
the general case for this problem. In that case, the infinite 
number of segments of the spiral will not coincide, and instead 
they will fill the whole surface densely. In other words, in this 
case the path of the representative point fills a two-dimensional 
region. This is characteristic of doubly periodic motions. 

55. Noncentral Two-dimensional Motion.— Let us consider 
motion in a field only slightly different from a central one, as, for 
instance, if we had a central field and a small external field of 
some other sort superposed. Then there will be slight torques 
acting on the particle, so that its angular momentum will change 
slowly. Now a given angular momentum corresponds to a given 
surface in Fig. 12. Hence in this motion the representative 
point does not confine itself to the surface we have drawn, but
moves also on larger and smaller surfaces. If the motion without 
the additional torques were doubly periodic, and if no new 
regularities were introduced, the path of the representative point 
would now fill densely a whole set of surfaces with continuously 
varying sizes; that is, it would fill up a three-dimensional volume, 
the most general thing possible. In most cases this volume would 
be the whole region consistent with the energy, so that the 
motion would be quasi-ergodic. The motion itself, in two- 
dimensional coordinate space, would resemble for a short time 
the orbit of Fig. 10, Chap. VII, but the circles to which the orbit 
is tangent would slowly increase or decrease in size, the loops of 
the orbit simultaneously getting less or more rounded. If the 
departure from a central field were large, we could not use this 
approximate description, but should have to say simply that 
successive loops of the orbit were not merely oriented differently, 
but were of different size and shape. 

56. Configuration Space and Momentum Space. — It is not 
always so easy to reduce a four-dimensional phase space to three 
dimensions as it was with the central field. We can always 


visualize the phase space, however, by imagining a separate 
n-dimensional momentum space associated with each point of 
the n-dimensional coordinate or configuration space. If we 
assume that the n coordinates are the rectangular or Cartesian 
coordinates, then we have a simple interpretation of the condition 
that the representative point move on an energy surface. For 
this states that kinetic energy = $E - V$, or $p_x^2 + p_y^2 + p_z^2 = 
2m(E - V)$. But $p_x^2 + p_y^2 + p_z^2$ is simply the square of the 
radius in momentum space, so that to a given energy and a given 
point of space corresponds a sphere (or, with two dimensions, a 
circle) in the momentum space, on the surface of which the 
representative point must move. In quasi-ergodic motion, the 
representative point at one time or another comes arbitrarily 
close to each point of the surface of these spheres. We note that 
the spheres exist, and have real radii, only in that part of con- 
figuration space where E — V is positive, and where, therefore, 
according to the energy principle, the motion can occur. But 
now in the more specialized types of motion, all points of the 
surface of the spheres are not available for representative points. 
Thus, in central motion, where the sphere degenerates to a circle 
in the two-dimensional momentum space, only those velocities 
are allowed which correspond to a given angular momentum. 
That is, $p_x$ and $p_y$ must satisfy at the same time the equations 
$p_x^2 + p_y^2 = 2m(E - V)$, $x p_y - y p_x$ = angular momentum = 
constant, the equations of a circle and a straight line, respectively, 
in momentum space. These intersect in two points or in no 
points, so that for some parts of the configuration space corre- 
sponding to positive kinetic energy there are two possible values 
of the momentum, and for other parts there is none, and the 
motion cannot occur. These excluded regions are those within 
the small circle in Fig. 10, Chap. VII, and outside the large circle, 
but within the circle on which the kinetic energy becomes zero. 
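
The circle-and-line construction just described can be tried numerically. The following Python sketch is an illustration added here, not part of the original text. The function name `allowed_momenta` and the sample numbers (unit mass, a potential $V = r^2/2$, energy 1, angular momentum 0.5) are assumptions chosen only for the example. It splits the momentum into radial and tangential parts, so that the straight line fixes the tangential part and the circle fixes the magnitude of the radial part.

```python
import math

def allowed_momenta(x, y, m, E, V, L):
    """Intersect the circle px^2 + py^2 = 2m(E - V) with the line x*py - y*px = L.

    Returns a list of (px, py) pairs: two entries when the motion is allowed
    at (x, y), and an empty list when the point is excluded.
    """
    r = math.hypot(x, y)
    p_squared = 2.0 * m * (E - V)
    if r == 0.0 or p_squared < 0.0:
        return []                      # no real radius in momentum space
    p_tangential = L / r               # fixed by the angular momentum
    radial_squared = p_squared - p_tangential ** 2
    if radial_squared < 0.0:
        return []                      # the line misses the circle
    e_r = (x / r, y / r)               # radial unit vector
    e_t = (-y / r, x / r)              # tangential unit vector
    solutions = []
    for sign in (1.0, -1.0):           # outward and inward radial motion
        p_radial = sign * math.sqrt(radial_squared)
        solutions.append((p_radial * e_r[0] + p_tangential * e_t[0],
                          p_radial * e_r[1] + p_tangential * e_t[1]))
    return solutions

# Assumed sample problem: m = 1, V = r^2 / 2, E = 1, L = 0.5.
for radius in (0.1, 1.0, 1.4):
    V = 0.5 * radius ** 2
    print(radius, allowed_momenta(radius, 0.0, 1.0, 1.0, V, 0.5))
```

In this assumed example the point at radius 1.0 has two allowed momenta, while the points at radius 0.1 and 1.4 have none, although the kinetic energy is still positive at both. When the line is exactly tangent to the circle the two returned values coincide.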

57. The Two-dimensional Oscillator. — A second example of 
two-dimensional motion is provided by the two-dimensional 
oscillator. In Chap. VII, Probs. 7, 8, and 9, it was shown that 
the motion of a particle attracted to a center by the forces $F_x = -ax$, 
$F_y = -ay$ could be solved by separation of variables, each coordinate 
vibrating like a separate oscillator, and the combined motions 
producing an elliptical orbit with the center at the center of 
attraction. The motion is periodic, with the same period which 
the corresponding one-dimensional motion would have. To 


obtain a nonperiodic motion, we make $F_x = -cx$, $F_y = -ky$, 
the force components being proportional to the displacements, 
but with different coefficients. It is easily seen that in this case 
the total force, regarded as a vector, is not in the same direction 
as the displacement. An example is found in the vibration of a 
rectangular stick, if one end is clamped and the other vibrates, 
since the stick is stiffer for bending in one direction than the 

Fig. 13. — Lissajous figure for the orbit of a two-dimensional oscillator. The 
ellipse surrounding the rectangle represents the equipotential corresponding to 
the energy of the motion. 

other, unless it is square. The variables are still separated in 
the equations of motion, and the solution is 

$$x = A_1 \cos\left(\sqrt{c/m}\, t - \alpha_1\right), \qquad y = A_2 \cos\left(\sqrt{k/m}\, t - \alpha_2\right). \tag{3}$$

The motion is no longer periodic, for after one period of the x 
motion, the y motion will not have traversed just a full period, 
but will be in a different phase. By plotting (see Fig. 13), one 
can see that the orbit is always within the rectangle bounded by 
$x = \pm A_1$, $y = \pm A_2$, that it is often tangent to the edges of this 
rectangle, and that in time it comes arbitrarily close to any point 
within the rectangle. The resulting figure is called a Lissajous 
figure, and this sort of motion is typical of many examples which 
one meets. The orbit in central motion is, in fact, a sort of 
Lissajous figure, as Fig. 10, Chap. VII, shows. 
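
The statement that the orbit stays inside the rectangle and in time comes arbitrarily close to any point within it can be illustrated with a short calculation. The Python sketch below is an added illustration, not part of the original text. The helper names and the sample values ($c = 1$, $k = 2$, $m = 1$, so that the frequency ratio is $\sqrt{2}$, with an arbitrarily chosen target point) are assumptions. It evaluates Eq. (3) on a grid of times and records the smallest distance between the orbit and the target.

```python
import math

def oscillator_position(t, A1, A2, c, k, m, alpha1=0.0, alpha2=0.0):
    """Eq. (3): each coordinate vibrates with its own frequency."""
    x = A1 * math.cos(math.sqrt(c / m) * t - alpha1)
    y = A2 * math.cos(math.sqrt(k / m) * t - alpha2)
    return x, y

def closest_approach(target, t_max, dt, A1, A2, c, k, m):
    """Smallest distance from the sampled orbit to `target` for 0 <= t <= t_max."""
    best = float("inf")
    for i in range(int(round(t_max / dt)) + 1):
        x, y = oscillator_position(i * dt, A1, A2, c, k, m)
        best = min(best, math.hypot(x - target[0], y - target[1]))
    return best

# Assumed sample values; the frequency ratio sqrt(k/c) = sqrt(2) is irrational.
A1, A2, c, k, m = 1.0, 0.5, 1.0, 2.0, 1.0
target = (0.3, -0.2)                 # any point inside the rectangle
short_run = closest_approach(target, 50.0, 0.01, A1, A2, c, k, m)
long_run = closest_approach(target, 2000.0, 0.01, A1, A2, c, k, m)
print("closest approach, short run:", short_run)
print("closest approach, long run: ", long_run)
```

Because the long run contains the short run, its closest approach can only be smaller or equal. Repeating the experiment with a rational frequency ratio (for instance $k = 4c$) gives a closed curve instead, and the closest approach to a point off that curve stops decreasing.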

The two-dimensional oscillator is a typical doubly periodic 
motion, the periods being just those of the two separate degrees 
of freedom. The displacements x and y are singly periodic, 
but if we wished to express, for example, the displacement in an 
arbitrary direction as a function of time, we should have $ax + by$, 


which would be a sum of two terms, one periodic with the one 
frequency, the other with the other. Inspection of Fig. 13 
shows that at a given point of space, there are just two branches 
of the orbit, corresponding to two definite values of the momen- 
tum, rather than having all momenta consistent with the given 
kinetic energy, in all directions, permitted as in quasi-ergodic motion. 

A small perturbation applied to the two-dimensional oscillator 
would destroy the double periodicity, and make the motion 
quasi-ergodic. Thus we might have a small central field added 
to the linear restoring force. If the perturbation were small, 
we might apply what is called the method of variation of con- 
stants. That is, we could consider the coordinates to be given 
by Eq. (3), but regard the A's and a's as slowly varying functions 
of time rather than constants. Substituting these expressions 
in the differential equations, we should find that the perturba- 
tion produced such changes of amplitude and phase, at rates 
proportional to the magnitude of the perturbation. Considered 
from the standpoint of Fig. 13, this means that the rectangle is 
gradually changing its dimensions, subject always, however, 
to the condition that it is at least approximately inscribed in 
the same ellipse, since the total energy is only slightly changed 
by the perturbation. The result then is a slowly changing 
Lissajous figure, looking therefore like a superposition of many 
such figures, filling up the ellipse, and giving at a point of space 
not two possible momenta only, but a continuous range of 
momenta, in all directions, leading, therefore, to quasi-ergodic 
motion. A similar discussion can be given for the simpler 
problem of the almost periodic oscillator. In the exactly 
periodic case, the orbit is a single ellipse inscribed in a rectangle 
like that of Fig. 13, which in turn is inscribed in an ellipse. 
If the problem is made slightly different, by introducing only 
a very small difference between the force constants in the two 
directions, the dimensions of the ellipse can be considered to 
change slowly, though it always remains inscribed in the rec- 
tangle. The actual Lissajous figure, as one sees at once by 
inspection, is very similar to what one would obtain by drawing 
a great many ellipses, all inscribed in the same rectangle. 

58. Methods of Solution. — We have seen one method of solving 
mechanical problems in several dimensions, that of separation 
of variables, by which the problem is reduced essentially to 


several independent one-dimensional problems. There are 
several problems which can be solved by this method, in addition 
to the oscillator and the central field problems which we have 
treated. The problem of a particle in the field of two attracting 
centers, both attracting according to the inverse square law, can 
be solved by separation in ellipsoidal coordinates, with the two 
centers as foci. In three dimensions, the central or the 
axially symmetrical fields can be solved by separation. The 
solutions in all these cases are multiply periodic, as we can see 
at once from the fact that each coordinate, acting like a one- 
dimensional problem, must be singly periodic. It is thus 
obvious that no problems except multiply periodic ones can be 
solved by separation, and it seems likely that the small list we 
have just given includes practically all the multiply periodic 
mechanical problems which exist in two or three dimensions. 

59. Contact Transformations and Angle Variables. — Hamil- 
ton's equations can be applied to multiply periodic motions by 
making certain transformations of coordinates which are called 
contact transformations, because it can be shown that they 
transform two curves which are in contact with each other in 
the original space into curves in contact in the new space. 
An ordinary transformation of coordinates, of the sort which 
we have discussed in connection with Lagrange's equations, is 
a transformation in which new coordinates are written as functions 
of the old ones: $q_i' = q_i'(q_1 \cdots q_n)$, if the $q$'s are the old 
coordinates, the $q'$'s the new ones. The new momenta, derived 
from the new Lagrangian function, are then functions of the old 
coordinates and momenta: $p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n)$. 
Such a transformation is called a point transformation. But 
in a contact transformation, the new coordinates as well as the 
new momenta are functions of both the old coordinates and 
momenta: 

$$q_i' = q_i'(q_1 \cdots q_n, p_1 \cdots p_n), \qquad p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n). \tag{4}$$

There must naturally be restrictions on the functions, just as 
in ordinary point transformations we require that the new 
momenta be derived from the new Lagrangian function. When 
these restrictions are applied, however, it proves that Hamilton's 
equations are still satisfied in the new coordinates, though 
Lagrange's are not. Such contact transformations can often 


be very useful in complicated problems, reducing them to forms 
which can be handled mathematically. A contact transforma- 
tion can be most easily visualized simply as a change of variables 
in the phase space. For instance, suppose we have the phase 
space for a linear oscillator, as in Fig. 11. We can easily choose 
the scale so that the line of constant energy is a circle, rather 
than an ellipse. Then it is often useful to introduce polar 
coordinates in the phase space, so that the motion is represented 
by a constant value of $r$, and a value of $\theta$ increasing uniformly 
with time. The angle $\theta$, or rather $\theta/2\pi$, in this case, is often 
called the angle variable, and is used as the coordinate. This 
is from analogy with the rotation of a body acted on by no 
torques, where the angular momentum stays constant, and the 
angle increases linearly with the time. The momentum conju- 
gate to the angle variable, which stays constant with time, is 
not simply the radius, as we should expect from the simple use 
of polar coordinates, but proves to be proportional to the square 
of $r$; in fact, it is just $\pi r^2$, or the area of the circle. This momentum 
is called the action variable, or phase integral, denoted by 
J, and the angle variable is denoted by w. 

Since Hamilton's equations hold in the transformed coordi- 
nates, and since evidently the energy H depends only on J, being 
independent of w, Hamilton's equations become 

$$-\frac{\partial H}{\partial w} = 0 = \frac{dJ}{dt}, \tag{5}$$

verifying the fact that $J$ is a constant of the motion; and 

$$\frac{\partial H}{\partial J} = \frac{dw}{dt}, \tag{6}$$

a quantity independent of time, and of $w$, verifying the fact that 
$w$ increases uniformly with time. Now since $w = \theta/2\pi$, it 
increases by unity in one period, so that $dw/dt$ is just $1/T$, 
where $T$ is the period, or is $\nu$, the frequency of motion. Hence 
we have the important relation that 

$$\nu = \frac{\partial H}{\partial J}, \tag{7}$$

giving the frequency of motion in terms of the derivative of the 
energy with respect to the action variable J. 


It can be shown in a similar way that action and angle variables 
can be introduced in general in one-dimensional periodic motions. 
In every case the w's increase uniformly with time, the frequency 
being given by Eq. (7). It also proves to be true in general 
that the action variable $J$ is given by the area of the path of 
the representative point in phase space, which is the reason why 
it is called a phase integral. This area can be written $\int p\,dq$, 
where this is analogous to $\int y\,dx$, the area under the curve $y(x)$. 
In Fig. 11, for instance, we integrate from the minimum to the 
maximum q along the upper branch of the ellipse, obtaining the 
part of the area above the q axis; then integrate back along 
the lower branch, where both p and dq are negative, obtaining 
the area below the q axis, so that the complete integral about the 
whole curve, which may be written $\oint p\,dq$, gives the whole area, 
or $J$. Connected with this is the criterion which a transformation 
of the $p$'s and $q$'s must satisfy if it is to be a contact transformation: 
it can be proved that it is a transformation in which 
areas in the phase space are preserved, or are not affected by the 
transformation, though the shape of an area in the new coordi- 
nates may be very different from what it was in the old. An 
immediate result of this is that the J's are the same no matter 
what coordinates we may use for computing them. 
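
The relation between the phase integral and the frequency can be checked numerically for the linear oscillator. The Python sketch below is an added illustration, not part of the original text. The function name `action` and the sample values of $m$, $k$, and $E$ are assumptions, and the potential is taken to be $V = \tfrac{1}{2}kq^2$. It evaluates $\oint p\,dq$ by the midpoint rule, using the substitution $q = a \sin u$ to avoid the square-root behavior at the turning points, and then estimates $\partial H/\partial J$ by a finite difference.

```python
import math

def action(E, m, k, n=2000):
    """Phase integral J = closed-path integral of p dq for V = k q^2 / 2."""
    a = math.sqrt(2.0 * E / k)               # turning point, where p = 0
    h = math.pi / n
    upper_branch = 0.0
    for i in range(n):
        u = -math.pi / 2.0 + (i + 0.5) * h   # q = a sin(u) runs from -a to a
        q = a * math.sin(u)
        p = math.sqrt(max(0.0, 2.0 * m * (E - 0.5 * k * q * q)))
        upper_branch += p * a * math.cos(u) * h   # p dq, with dq = a cos(u) du
    return 2.0 * upper_branch                # the lower branch adds an equal area

# Assumed sample values.
m, k = 1.0, 4.0
nu = math.sqrt(k / m) / (2.0 * math.pi)      # elementary frequency of the oscillator
E, dE = 1.0, 1.0e-3

J = action(E, m, k)
dE_dJ = (2.0 * dE) / (action(E + dE, m, k) - action(E - dE, m, k))

print("J from the area:", J, " E / nu:", E / nu)
print("dE/dJ:", dE_dJ, " nu:", nu)
```

For this oscillator the area is that of an ellipse, so $J = E/\nu$ and the finite difference reproduces Eq. (7). The same routine could be adapted to another one-dimensional potential by changing the expression for $p$ and the turning points, which is left as an assumption-dependent exercise.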

Angle variables can also be introduced in cases with several 
degrees of freedom, provided the motion is multiply periodic, 
by using a separate angle variable for each coordinate. It is 
evident that the method could not be used with motions which 
were not multiply periodic, for we have seen that it is only in 
the multiply periodic motions that there are quantities, as, for 
example, angular momenta, which stay constant. Yet the 
action variables, or J's, must stay constant, and consequently 
cannot be introduced, for example, in quasi-ergodic motions, 
where by hypothesis constants of the motion of this sort do not exist. 

We shall meet angle variables and phase integrals again in 
Chap. XXX, where it is seen that they have close connection 
with the quantum theory. In that theory, the phase integrals 
prove to be quantized; that is, they take on only discrete values, 
$J$ being limited to integral multiples of a fundamental physical 
constant, Planck's constant $h$; and Eq. (7) for frequencies is 
replaced by an equation of finite differences, $\nu$ being a difference 
of energy in two energy levels, divided by the corresponding 


difference of J (which can be simply h). These two formulas, 
which we elaborate later, form the basis of much of quantum theory. 

60. Methods of Solution for Nonperiodic Motions. — When 
we meet a problem whose solution is quasi-ergodic, we are facing 
a branch of mathematics which offers no explicit or exact solutions. 
The only solutions are in the form of various series 
methods, for instance the method of perturbations, which 
can be used if the motion is almost multiply periodic. We 
indicated an example of this in discussing the two-dimensional 
oscillator, where we treated the problem as a Lissajous figure with 
slowly varying amplitudes and phases. In general, the method 
of perturbations consists in developing the various quantities 
which appear in the problem in power series in the small quan- 
tities measuring the deviation from the multiply periodic case. 
If, for instance, that case has been discussed by the method of 
angle variables, we regard the J's as slowly varying functions 
of time, their rate of variation being proportional, to the first 
order, to the magnitude of the perturbation. But in all these 
methods there is great difficulty in the matter of the convergence 
of the series; as time goes on, or as we consider larger and larger 
perturbations, they converge worse and worse, as is natural from 
the physical fact that often a slight change in initial conditions 
may, after the lapse of enough time, cause a profound change 
in the motion. These difficulties, as well as these methods of 
solution, are met particularly in celestial mechanics. 


1. Given a linear oscillator of mass $m$, frequency $\nu$, displacement $x$, 
momentum $p$, we can introduce a new coordinate $w$ and momentum $J$, by 
the transformation 

$$x = \sqrt{\frac{J}{2\pi^2 m \nu}}\, \cos 2\pi w, \qquad p = -\sqrt{2 m J \nu}\, \sin 2\pi w.$$

This change of variables can be shown to be a contact transformation. Find 
the Hamiltonian in terms of the new variables, by substituting these values 
of x and p in the total energy. Show that this resulting Hamiltonian 
depends on J alone, being independent of w, and show that w is an angle 
variable. Verify that J is the phase integral, or area enclosed by the orbit 
in the phase space, and that $\nu = \partial H/\partial J$. Show the geometrical 
interpretation of the contact transformation in the phase space. 

2. An electron of charge $-e$, mass $m$, moves about a nucleus of charge $Ze$, 
and very large mass. The potential energy is $-Ze^2/r$. Assuming the 
energy to be $E$, angular momentum $p_\theta$, separate variables, and consider 


the radial motion as a one-dimensional problem, as in Chap. VII. Take a 
two-dimensional phase space in which $r$ and $p_r$ are variables, and plot the 
path of the representative point in this space. 

3. Find the area of the path of the representative point in Prob. 2, and 
show that it is $\sqrt{2\pi^2 m Z^2 e^4/(-E)} - 2\pi p_\theta$. Set this equal to $J_r$, the action 
variable connected with the radial motion. Find the energy in terms of 
$J_r$, and by differentiation find the frequency of motion. Verify this result 
in the special case of circular motion, where you can compute the rotational 
frequency by elementary methods. 

4. If $F_x = -cx$, $F_y = -ky$, prove by direct calculation that the force, 
regarded as a vector, is at right angles to the equipotential. Show that 
the force is not in the direction of the displacement. 

5. Suppose in a two-dimensional oscillator that the force constants along 
the two axes are only slightly different from each other. Prove that the 
orbit resembles an ellipse, of slowly changing shape and size. (Hint: show 
that $x = A \cos(\omega t - \alpha)$, $y = B \cos(\omega t - \beta)$, where $A$, $B$, $\alpha$, and $\beta$ are 
constants, is the equation of the ellipse. Then show that the equation of 
the path of the oscillator can be written in this form, if $\alpha$ and $\beta$ are slowly 
changing functions of time.) 

6. A particle moves as if it were executing simple harmonic motion about 
the center of a turntable, and at the same time the turntable were rotating 
with uniform angular velocity. Compute the x coordinate of the particle 
as a function of time, and show that the motion is doubly periodic. 

7. Sketch the orbits in Prob. 6, for several different ratios between the 
frequencies of oscillation and rotation, including some cases of irrational 
ratios, and also simple rational ratios, as 1/1, 1/2, 2/1. 

8. A particle moving in two dimensions is attracted by two centers, of 
the same strength, attracting with a force proportional to the inverse square 
of the distance. Compute and plot a number of equipotentials, showing 
that for some energies the motion must be entirely confined to the region 
around one or the other center, while for larger energies it can surround both 
centers. 

9. A particle moves in three dimensions under the action of a force of 
attraction to a center, depending only on the distance. Set up the problem 
in spherical coordinates, using the results of Probs. 3 and 4, Chap. VIII. 
Show that the variables can be separated, so that the problem is multiply 
periodic. Show that energy, total angular momentum, and the component 
of angular momentum along the axis of coordinates, all remain constant, 
showing the connection of these quantities with the generalized momenta 
of the problem. Using the obvious fact that the motion occurs in a plane 
and is just like two-dimensional central motion in that plane, show that 
the periods of the motions in $\theta$ and $\phi$ are the same, so that the motion is only 
doubly, not triply, periodic. 



In the preceding chapters we have been treating the mechanics 
of particles. Then we have passed on to the general methods of 
Lagrange and Hamilton, which can be applied to all sorts of 
mechanical problems. The present chapter will take up the 
motion of rigid bodies. 

In elementary work, one learns the main outlines of the 
problem of the motion of a rigid body. We know that its motion 
is a superposition of a translation and a rotation. There are two 
fundamental laws of motion: the force equals the time rate of 
change of linear momentum, and the torque equals the time 
rate of change of angular momentum. To make our ideas more 
precise, the translational motion generally refers to the motion 
of the center of gravity, and the rotational to rotation about the 
center of gravity. The motion of the center of gravity is essen- 
tially like the motion of a particle, which we have already treated. 
In order to leave that out in the present chapter, we shall assume 
that no net forces act, or that the body is pivoted, rotating about 
a fixed point. 

61. Elementary Theory of Precessing Top. — A torque is a 
vector, equal in magnitude to the force acting times its lever arm 
(that is, the perpendicular distance from the center of rotation 
to the line of action of the force), and at right angles to force 
and lever arm. That is, in vector notation, the torque on a single 
particle is $(\mathbf{r} \times \mathbf{F})$, where $\mathbf{r}$ is the radius vector to the particle, 
F the force acting, and the torque on the whole body is the vector 
sum of the separate torques on its parts. Similarly the angular 
momentum is a vector, defined in an analogous way: the angular 
momentum of a particle is equal in magnitude to the momentum 
times its lever arm, and at right angles to both, so that it is 
$[\mathbf{r} \times (m\mathbf{v})]$, or $m(\mathbf{r} \times \mathbf{v})$, and the total angular momentum of the 
body is the vector sum of the angular momenta of its parts. We 
see then that the equation "torque equals time rate of change of 
angular momentum" is a vector equation. This results in 
having two separate sorts of effect which a torque can produce. 
For we can analyze the torque into two components, one parallel 




to the angular momentum, the other at right angles. The first 
component of torque produces an increase or decrease of angular 
momentum in the same direction as the angular momentum 
already existing; that is, it produces a speeding up or slowing 
down of the rotation, or an ordinary angular acceleration. This 
is the effect seen in the speeding up or slowing down of wheels 
on fixed axles. The component of torque at right angles to the 
angular momentum, on the other hand, produces a rotation or 
precession of the angular momentum vector, without change 
of length, and hence a change in the axis of rotation. This 
is the effect considered in the simple theory of the symmetrical 
top: if $\mathbf{p}$ represents the angular momentum of the top 
at a given instant (see Fig. 14), which is in approximately the 
same direction as the axis of figure of the top, the torque of 
gravity on the top will be $mgl \sin\theta$ in magnitude, where $l$ is 
the distance from the point of support to the center of gravity. 
The torque will act at right angles to the axis and $\mathbf{p}$, so 
that the change of momentum in time $dt$ will be $d\mathbf{p}$, as shown. 
Thus the angular momentum after time $dt$ will be the vector 
$\mathbf{p} + d\mathbf{p}$, obtained from the old vector by a precession, as if 
the whole figure were rotated about the axis through the angle $d\phi$. 

Fig. 14. — Angular momentum vectors for precessing top. The increment 
of angular momentum $d\mathbf{p}$, proportional to and in the same direction as the 
torque of gravity, changes the total angular momentum from $\mathbf{p}$ to $\mathbf{p} + d\mathbf{p}$, 
giving a precession through the angle $d\phi$. 

We can easily find the rate of precession. For $d\phi$ evidently equals 
$dp$ divided by the radius of the circle, or is $dp/(|\mathbf{p}| \sin\theta)$. On the 
other hand, $dp = mgl \sin\theta\, dt$. Hence 

$$\frac{mgl \sin\theta\, dt}{|\mathbf{p}| \sin\theta} = d\phi, \qquad \text{or} \qquad \frac{d\phi}{dt} = \frac{mgl}{|\mathbf{p}|},$$

a precession increasing with increasing torque, but decreasing with 
increasing angular momentum. We note that if we regard the precessional 
velocity as a vector, say $\boldsymbol{\omega}$, along the vertical direction, and having 
a magnitude $d\phi/dt$, we have 

$$\frac{d\mathbf{p}}{dt} = (\boldsymbol{\omega} \times \mathbf{p}). \tag{1}$$

This is a general relation for a precessing vector, as we readily see. 
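
The precession relation of Eq. (1) can be integrated directly to confirm that the vector keeps its length and turns about the vertical at the rate $mgl/|\mathbf{p}|$. The Python sketch below is an added illustration, not part of the original text. The function names, the use of a fourth-order Runge-Kutta step, and the numerical values of $m$, $g$, $l$, $|\mathbf{p}|$, and $\theta$ are all assumptions made for the example.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precess(p, omega, t_total, n_steps):
    """Integrate dp/dt = omega x p with classical fourth-order Runge-Kutta."""
    h = t_total / n_steps
    for _ in range(n_steps):
        k1 = cross(omega, p)
        k2 = cross(omega, tuple(p[i] + 0.5 * h * k1[i] for i in range(3)))
        k3 = cross(omega, tuple(p[i] + 0.5 * h * k2[i] for i in range(3)))
        k4 = cross(omega, tuple(p[i] + h * k3[i] for i in range(3)))
        p = tuple(p[i] + h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
                  for i in range(3))
    return p

# Assumed sample values for a small top.
m, g, l = 0.1, 9.8, 0.05
p_mag, theta = 0.02, 0.5
rate = m * g * l / p_mag                       # d(phi)/dt from the text
omega = (0.0, 0.0, rate)                       # precession vector along the vertical

p0 = (p_mag * math.sin(theta), 0.0, p_mag * math.cos(theta))
t_total = 0.5
p1 = precess(p0, omega, t_total, 1000)

length = math.sqrt(sum(c * c for c in p1))
azimuth = math.atan2(p1[1], p1[0])
print("length before and after:", p_mag, length)
print("azimuth turned:", azimuth, " expected:", rate * t_total)
```

The vertical component of the vector and its length are unchanged by the integration, and only the azimuth advances, which is the geometrical content of the statement that the torque at right angles to the angular momentum produces a precession.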

The elementary ideas of torque and angular momentum do not 
permit us to go much farther than we have indicated here, 
without further analysis. With a body in the absence of torques, 
for instance, we know at once that the angular momentum stays 
constant, both in direction and in magnitude. But this tells us 
little about the actual complicated motion. We must then 
examine the problem more in detail. In the succeeding sections 
we consider the angular momentum, kinetic energy, etc., of solid 
bodies of arbitrary shape, with arbitrary axes of rotation, though 
we always assume that they rotate about a fixed point, as the 
center of gravity. 

62. Angular Momentum, Moment of Inertia, and Kinetic 
Energy. — Let a body rotate about the origin as a center, the axis 
of rotation having direction cosines $\lambda$, $\mu$, $\nu$ with the three axes. 
We may regard the angular velocity as a vector, whose direction 
is the axis of rotation, and whose magnitude is the magnitude of 
angular velocity. Thus, if the vector is $\boldsymbol{\omega}$, its magnitude $\omega_0$, we 
have $\omega_x = \lambda\omega_0$, $\omega_y = \mu\omega_0$, $\omega_z = \nu\omega_0$. Now we can easily find 
the linear velocity of any point of the body. This is numerically 
$\rho\omega_0$, where $\rho$ is the perpendicular distance from the point to the 
axis of rotation, and is at right angles to the axis of rotation and 
the perpendicular distance. In other words, the velocity is given 
by the vector product $(\boldsymbol{\omega} \times \mathbf{r})$; a little consideration shows that 
the vector product has the right direction. Now that we know 
the velocity of each point, we can compute the angular momentum. 
We have already seen that this is the sum of terms 
$m(\mathbf{r} \times \mathbf{v})$ for all particles of the body. But $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, so that 
angular momentum = $\sum m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]$. This can be easily 
expanded. Thus the $x$ component, for example, is 

$$\begin{aligned}
m[y(\boldsymbol{\omega} \times \mathbf{r})_z - z(\boldsymbol{\omega} \times \mathbf{r})_y]
&= m[y(\omega_x y - \omega_y x) - z(\omega_z x - \omega_x z)] \\
&= m\omega_x(y^2 + z^2) - m\omega_y xy - m\omega_z xz,
\end{aligned}$$

with corresponding formulas for the other components. If now 
we sum over all particles of the body, remembering that $\boldsymbol{\omega}$ is 
the same for all, we have, if $p_x$, $p_y$, $p_z$ are the components of 
angular momentum, 

$$\begin{aligned}
p_x &= A\omega_x - F\omega_y - E\omega_z, \\
p_y &= -F\omega_x + B\omega_y - D\omega_z, \\
p_z &= -E\omega_x - D\omega_y + C\omega_z,
\end{aligned} \tag{2}$$

where for abbreviation we set $A = \sum m(y^2 + z^2)$, $B = \sum m(z^2 + x^2)$, 
$C = \sum m(x^2 + y^2)$, $D = \sum myz$, $E = \sum mzx$, $F = \sum mxy$. The 
quantities A, B, and C are called the moments of inertia, and 
D, E, F are the products of inertia; the first three are obviously 
the moments of inertia of the body in the ordinary sense, about 
the x, y, and z axes, respectively. We note one thing at the 
outset: the angular momentum vector is not in general parallel 
to the angular velocity vector. Thus if $\omega_y = \omega_z = 0$, so that 
the angular velocity is along the x axis, we have all three com- 
ponents of p in general different from zero. 
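
Eq. (2) can be checked against the direct sum $\sum m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]$ for a small set of point masses. The Python sketch below is an added illustration, not part of the original text. The function names and the three sample masses and positions are assumptions. It also shows the point made above: with the angular velocity along the $x$ axis, the angular momentum has nonvanishing $y$ and $z$ components.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_coefficients(masses, positions):
    """Moments A, B, C and products D, E, F of inertia as defined after Eq. (2)."""
    A = B = C = D = E = F = 0.0
    for m, (x, y, z) in zip(masses, positions):
        A += m * (y * y + z * z)
        B += m * (z * z + x * x)
        C += m * (x * x + y * y)
        D += m * y * z
        E += m * z * x
        F += m * x * y
    return A, B, C, D, E, F

def angular_momentum_eq2(coeffs, omega):
    """Angular momentum from Eq. (2)."""
    A, B, C, D, E, F = coeffs
    wx, wy, wz = omega
    return (A * wx - F * wy - E * wz,
            -F * wx + B * wy - D * wz,
            -E * wx - D * wy + C * wz)

def angular_momentum_direct(masses, positions, omega):
    """Angular momentum as the sum of m r x (omega x r) over the particles."""
    total = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        for i in range(3):
            total[i] += m * term[i]
    return tuple(total)

# Assumed sample body: three point masses.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.5), (0.0, 1.0, 1.0), (-1.0, 2.0, 0.0)]
omega = (1.0, 0.0, 0.0)              # rotation about the x axis only

coeffs = inertia_coefficients(masses, positions)
p_eq2 = angular_momentum_eq2(coeffs, omega)
p_direct = angular_momentum_direct(masses, positions, omega)
print("from Eq. (2):   ", p_eq2)
print("from direct sum:", p_direct)
```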

Next we find the kinetic energy. For a single particle, this 
is $\tfrac{1}{2}mv^2$, or $\tfrac{1}{2}m(\boldsymbol{\omega} \times \mathbf{r})^2$. Again expanding, this is 

$$\begin{aligned}
&\tfrac{1}{2}m[(\omega_y z - \omega_z y)^2 + (\omega_z x - \omega_x z)^2 + (\omega_x y - \omega_y x)^2] \\
&\quad = \tfrac{1}{2}m[\omega_x^2(y^2 + z^2) + \omega_y^2(z^2 + x^2) + \omega_z^2(x^2 + y^2) \\
&\qquad - 2\omega_x\omega_y xy - 2\omega_y\omega_z yz - 2\omega_z\omega_x zx].
\end{aligned}$$

Summing over all particles, and using the abbreviations above, 
this is 

$$T = \tfrac{1}{2}(A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_y\omega_z - 2E\omega_z\omega_x - 2F\omega_x\omega_y). \tag{3}$$

The quantity $T$ can be written as $\tfrac{1}{2}I\omega_0^2$, where 

$$I = \lambda^2 A + \mu^2 B + \nu^2 C - 2\mu\nu D - 2\nu\lambda E - 2\lambda\mu F. \tag{4}$$

It is easily shown that $I$ is simply $\sum m\rho^2$, where $\rho$, as before, is 
the perpendicular distance from the point to the axis of rotation, 
so that $I$ agrees with the elementary definition. If we imagine 
$\lambda$, $\mu$, and $\nu$ varied in any manner, the quantities $A, B, \ldots F$ 
do not change. As a variation in $\lambda$, $\mu$, $\nu$ means a variation of 
direction of the rotation axis through the center of rotation $O$, 
we see that the sums $A \ldots F$ completely determine the moment 
of inertia of the body about any axis through the same center of 
rotation. 
63. The Ellipsoid of Inertia; Principal Axes of Inertia. — The 
Eq. (4) for the moment of inertia $I$ may be interpreted geometrically 
in a very simple manner. The equation 

$$Ax^2 + By^2 + Cz^2 - 2Dyz - 2Ezx - 2Fxy = \text{constant}$$

represents a surface of second degree. If we denote by $r$ the radius 
vector drawn from $O$ to a point on this surface, having direction 
cosines $\lambda$, $\mu$, $\nu$, this equation becomes 

$$r^2(A\lambda^2 + B\mu^2 + C\nu^2 - 2D\mu\nu - 2E\nu\lambda - 2F\lambda\mu) = \text{constant}. \tag{5}$$

The expression inside the parentheses is just $I$, the moment of 
inertia, so that we have $r^2 = \text{constant}/I$, or $I = \text{constant}/r^2$. 
Now, since the moment of inertia is always positive and can 
never vanish, $r^2$ cannot become infinite and our surface is a closed 
surface. Since it is of second degree it is an ellipsoid with its 
center at $O$, and is called the ellipsoid of inertia at the point $O$. 
The ellipsoid of inertia has the simple physical significance that 
the moment of inertia of the body about any axis through $O$ is 
measured by the inverse square of the radius vector from $O$, 
drawn parallel to the rotation axis and terminating on the surface 
of the ellipsoid. 

Every ellipsoid has three principal axes which are mutually 
orthogonal. These axes are known as the principal axes of 
inertia at $O$. Just as in the case of an ellipse, when coordinate 
axes are chosen coincident with the principal axes, the equation 
of the ellipsoid reduces to a sum of squares, so that the coefficients 
of the terms in yz, zx, and xy disappear and we have D = E = 
F = 0. We shall often use coordinate axes coincident with the 
principal axes, but since these axes are fixed with respect to 
the rigid body, we must always remember that they are rotating 
axes in space, and we must describe their motion with respect to a 
system of axes fixed in space. Referred to the principal axes, 
the moment of inertia becomes simply $\lambda^2 A_0 + \mu^2 B_0 + \nu^2 C_0$, 
where these $A_0$, $B_0$, and $C_0$ are now computed with respect to 
axes fixed in the body, and so do not change with rotation of the 
body, as do the ordinary moments and products of inertia computed 
with respect to fixed axes. The kinetic energy of rotation 
is then $T = \tfrac{1}{2}I\omega_0^2 = \tfrac{1}{2}(\lambda^2\omega_0^2 A_0 + \mu^2\omega_0^2 B_0 + \nu^2\omega_0^2 C_0)$, which is 
also $T = \tfrac{1}{2}(A_0\omega_1^2 + B_0\omega_2^2 + C_0\omega_3^2)$, where $\omega_1$, $\omega_2$, and $\omega_3$ are 
the components of $\boldsymbol{\omega}$ taken about the principal axes. 
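
The disappearance of the product of inertia in principal axes can be seen in the simplest case, the one compared above to an ellipse. The Python sketch below is an added illustration, not part of the original text. It assumes a flat body lying in the $xy$ plane, so that $D = E = 0$ and only $F$ has to be removed. The function names and sample masses are assumptions, and the rotation angle $\phi = \tfrac{1}{2}\arctan[2F/(B - A)]$ follows from requiring that $F' = \tfrac{1}{2}(A - B)\sin 2\phi + F\cos 2\phi$ vanish in the rotated axes.

```python
import math

def planar_coefficients(masses, points):
    """A = sum m y^2, B = sum m x^2, F = sum m x y for a body in the xy plane."""
    A = sum(m * y * y for m, (x, y) in zip(masses, points))
    B = sum(m * x * x for m, (x, y) in zip(masses, points))
    F = sum(m * x * y for m, (x, y) in zip(masses, points))
    return A, B, F

def principal_angle(A, B, F):
    """Angle by which the axes are turned so that the product of inertia vanishes."""
    return 0.5 * math.atan2(2.0 * F, B - A)

def coordinates_in_rotated_axes(points, phi):
    """Coordinates of the same points referred to axes turned through phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [(x * c + y * s, -x * s + y * c) for x, y in points]

# Assumed sample body: three point masses in the xy plane.
masses = [1.0, 2.0, 3.0]
points = [(1.0, 0.5), (-1.0, 2.0), (0.5, -1.0)]

A, B, F = planar_coefficients(masses, points)
phi = principal_angle(A, B, F)
A0, B0, F0 = planar_coefficients(masses, coordinates_in_rotated_axes(points, phi))

print("original axes:  A, B, F =", A, B, F)
print("principal axes: A0, B0, F0 =", A0, B0, F0)
```

The sum $A + B$, which for a flat body is the moment of inertia about the perpendicular axis, is the same in both sets of axes. For a body of general shape all three products of inertia must be removed together, which amounts to finding the principal axes of the ellipsoid and is not attempted in this sketch.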

64. The Equations of Motion. — Suppose the moment, or torque, 
of the external force is $\mathbf{M}$, with components $M_x$, $M_y$, $M_z$. 
Then the equations of motion are obtained by setting the torque 
equal to the time rate of change of angular momentum: $\mathbf{M} = 
d\mathbf{p}/dt$, or for the $x$ component 

$$M_x = \frac{d}{dt}(A\omega_x - F\omega_y - E\omega_z), \tag{6}$$

where, of course, we are using arbitrary x, y, z coordinates, not 
the principal axes. In performing the differentiation, we must 
remember that not only are $\omega_x$, $\omega_y$, $\omega_z$ changing, but also $A$, $F$, $E$, 
since the body is rotating, and these moments and products of 
inertia are defined with respect to a particular fixed coordinate 
system. Thus we have 

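$$M_x = A\dot{\omega}_x - F\dot{\omega}_y - E\dot{\omega}_z + \dot{A}\omega_x - \dot{F}\omega_y - \dot{E}\omega_z.$$

The last three terms can be rewritten, using, for instance, 

$$\dot{A} = \frac{d}{dt}\sum m(y^2 + z^2) = 2\sum m(y\dot{y} + z\dot{z}),$$

along with $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, so that $\dot{x} = \omega_y z - \omega_z y$, etc. Without 
trouble we find that the equations can be written 

$$M_x = A\dot{\omega}_x - F\dot{\omega}_y - E\dot{\omega}_z - (B - C)\omega_y\omega_z - D(\omega_y^2 - \omega_z^2) + F\omega_x\omega_z - E\omega_x\omega_y, \tag{7}$$

with equivalent equations for the $y$ and $z$ components. The 
latter terms seem very complicated; but we readily see that they 
can be written as a vector product, giving 

$$M_x = \frac{dp_x}{dt} = A\dot{\omega}_x - F\dot{\omega}_y - E\dot{\omega}_z + (\boldsymbol{\omega} \times \mathbf{p})_x. \tag{8}$$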

The equation for time rate of change of angular momentum, 
in the form above, has a simple interpretation. Suppose we 
have any vector $\mathbf{G}$, and that we consider it with respect to rotating 
coordinates, rotating with the angular velocity $\boldsymbol{\omega}$. If we 
were rotating with the coordinates, the vector would seem to 
have a certain time rate of change, which we may call $d'\mathbf{G}/dt$. 
But this will not be its actual time rate of change $d\mathbf{G}/dt$, when looked 
at from a stationary system of coordinates. For even a vector 
which remained constant in the rotating system would actually 
be changing, just on account of its rotation. In fact, the rate 
of change of the vector for this latter reason, using the same sort 
of argument which we met in Eq. (1) in describing the precessing 
top, is $(\boldsymbol{\omega} \times \mathbf{G})$, and the total rate of change of $\mathbf{G}$ is the sum of 
these two effects, or 

$$\frac{d\mathbf{G}}{dt} = \frac{d'\mathbf{G}}{dt} + (\boldsymbol{\omega} \times \mathbf{G}). \tag{9}$$

In particular, then, with the angular momentum, we evidently 
have two terms of the sort considered above. We conclude 
therefore that 

$$A\dot\omega_x - F\dot\omega_y - E\dot\omega_z = \left(\frac{d'\mathbf{p}}{dt}\right)_x,$$

so that these terms represent the rate of change of angular 
momentum, with respect to the rotating axes. 

One result of the theorem we have just worked out is interesting. 
Let the vector $\mathbf{G}$ be the angular velocity. Then 
$d\boldsymbol{\omega}/dt = d'\boldsymbol{\omega}/dt$, since the vector product $(\boldsymbol{\omega} \times \boldsymbol{\omega})$ is zero. Hence the 
components of time rate of change of angular velocity are the 
same in fixed as in rotating axes. 
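
The statement that a vector held constant in the rotating axes changes at the rate $\boldsymbol{\omega} \times \mathbf{G}$ when seen from fixed axes can be tested numerically. The sketch below is an illustration only: it assumes a rotation about the z axis at a constant rate `w`, and the function names are hypothetical.

```python
import math


def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_about_z(v, angle):
    """Components in fixed axes of a vector v carried round z through angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


def fixed_frame_rate(G_rot, w, t, h=1e-6):
    """Central-difference dG/dt, seen from fixed axes, of a vector that is
    constant (equal to G_rot) in axes rotating about z at the rate w."""
    plus = rotate_about_z(G_rot, w * (t + h))
    minus = rotate_about_z(G_rot, w * (t - h))
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))


def omega_cross_G(G_rot, w, t):
    """The predicted rate omega x G at time t, with omega = (0, 0, w)."""
    return cross((0.0, 0.0, w), rotate_about_z(G_rot, w * t))
```

Since the vector is constant in the rotating axes, its rate of change in those axes vanishes, and the finite-difference rate should agree with the vector product alone.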

65. Euler's Equations. The equations of motion, (7) or (8), 
take on a particularly simple form when expressed in terms of 
the principal axes. Let us first take our fixed axes xyz so that 
they coincide with the instantaneous values of the rotating, 
principal axes. Then D, E, F are instantaneously zero, and the 
equations (7) are 

$$M_x = A_0\dot\omega_x - (B_0 - C_0)\omega_y\omega_z,$$

with two similar equations. But now let $\omega_1$, $\omega_2$, $\omega_3$ be the 
components of angular velocity with respect to the rotating principal 
axes. Momentarily these equal $\omega_x$, $\omega_y$, $\omega_z$. But also $\dot\omega_1$ is the 
same thing as the x component of the time rate of 
change of angular velocity taken with respect to the rotating axes. 
We have just shown, however, that this equals $(d\boldsymbol{\omega}/dt)_x$, the 
corresponding component with respect to the fixed axes, or 
$\dot\omega_x$. Hence we can rewrite our equations entirely in terms of 
the moving axes, 

$$M_1 = A_0\dot\omega_1 - (B_0 - C_0)\omega_2\omega_3$$

$$M_2 = B_0\dot\omega_2 - (C_0 - A_0)\omega_3\omega_1$$

$$M_3 = C_0\dot\omega_3 - (A_0 - B_0)\omega_1\omega_2, \tag{10}$$

where $M_1$, $M_2$, $M_3$ are the components of torque with respect 
to the rotating axes. These equations are called Euler's equations. 
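
Euler's equations can be integrated numerically once the torque components are given. The sketch below is illustrative and not from the original text: it assumes a fourth-order Runge-Kutta step, hypothetical function names, and zero torque by default, in which case the kinetic energy and the magnitude of the angular momentum should both stay constant.

```python
def euler_rhs(w, moments, torque=(0.0, 0.0, 0.0)):
    """Time derivatives of (w1, w2, w3) obtained by solving Eq. (10)."""
    A0, B0, C0 = moments
    w1, w2, w3 = w
    M1, M2, M3 = torque
    return ((M1 + (B0 - C0) * w2 * w3) / A0,
            (M2 + (C0 - A0) * w3 * w1) / B0,
            (M3 + (A0 - B0) * w1 * w2) / C0)


def rk4_step(w, moments, dt, torque=(0.0, 0.0, 0.0)):
    """One classical Runge-Kutta step with a torque held constant in the body."""
    def shifted(k, scale):
        return tuple(wi + scale * ki for wi, ki in zip(w, k))

    k1 = euler_rhs(w, moments, torque)
    k2 = euler_rhs(shifted(k1, dt / 2), moments, torque)
    k3 = euler_rhs(shifted(k2, dt / 2), moments, torque)
    k4 = euler_rhs(shifted(k3, dt), moments, torque)
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))


def integrate(w0, moments, dt, steps):
    """Advance the body components of angular velocity through `steps` steps."""
    w = tuple(w0)
    for _ in range(steps):
        w = rk4_step(w, moments, dt)
    return w


def kinetic_energy(w, moments):
    return 0.5 * sum(I * wi * wi for I, wi in zip(moments, w))


def angular_momentum_squared(w, moments):
    return sum((I * wi) ** 2 for I, wi in zip(moments, w))
```

Checking the two conserved quantities after many steps is a convenient test that the signs in the three equations have been entered correctly.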

66. Torque-free Motion of a Symmetric Rigid Body. We 
shall now apply Euler's equations to the motion of a rigid body 
symmetric about an axis, subject to the action of no external 
torques (either the external forces are zero, or act at the center 
of mass). The earth provides a good example, if we neglect 
the torques due to sun and moon. We choose the center of mass 
as an origin, and take the axis of symmetry as principal axis 
3. The principal moments of inertia are then $A_0$, $A_0$, $C_0$. 
Euler's equations for this case are 

$$A_0\dot\omega_1 + \omega_2\omega_3(C_0 - A_0) = 0$$

$$A_0\dot\omega_2 + \omega_1\omega_3(A_0 - C_0) = 0$$

$$C_0\dot\omega_3 = 0.$$

The last equation integrates at once, giving $\omega_3$ = constant. This 
means that the resultant angular velocity has a constant 
component along the axis of symmetry. If we now place 
$\alpha = \omega_3\dfrac{C_0 - A_0}{A_0}$, the two other equations are $\dot\omega_1 + \alpha\omega_2 = 0$ and 
$\dot\omega_2 - \alpha\omega_1 = 0$. Differentiating the first of these, we find 
$\ddot\omega_1 + \alpha\dot\omega_2 = \ddot\omega_1 + \alpha^2\omega_1 = 0$, which has as its solution 
$\omega_1 = a\cos(\alpha t + \epsilon)$, and putting this value of $\omega_1$ in the second equation 
we find $\omega_2 = a\sin(\alpha t + \epsilon)$, where $a$ and $\epsilon$ are integration 
constants. From these equations we see that the resultant angular 
velocity $\omega = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \sqrt{a^2 + \omega_3^2}$ is constant, and 
that the projection of $\boldsymbol{\omega}$ on the plane perpendicular to the axis 
of symmetry and fixed in the body describes a circle of radius $a$ 
with a period given by 

$$\tau = \frac{2\pi}{\alpha} = \frac{2\pi}{\omega_3}\,\frac{A_0}{C_0 - A_0}.$$

In the case of the earth, $\omega_3 = 2\pi$ per day, so that $\tau$ becomes $A_0/(C_0 - A_0)$ 
days, which is about 300 days and is known as the Euler period. 
This period is not observed, but there is one of 427 days known 
as the Chandler period giving rise to a variation of latitude. 
When the imperfect rigidity of the earth is taken into account, 
it is possible to identify these two as the same. 
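
The period formula lends itself to a short numerical illustration. The following sketch is not from the original text: the function names are hypothetical, and the moments 300 and 301 (arbitrary units) are assumed values chosen only so that $A_0/(C_0 - A_0)$ equals the round figure of 300 quoted above.

```python
import math


def body_precession_rate(A0, C0, w3):
    """alpha = w3 (C0 - A0) / A0, the rate at which the projection of the
    angular velocity circulates in the body."""
    return w3 * (C0 - A0) / A0


def euler_period(A0, C0, w3):
    """tau = 2 pi / |alpha|, in the time unit in which w3 is expressed."""
    return 2 * math.pi / abs(body_precession_rate(A0, C0, w3))


def transverse_components(a, alpha, epsilon, t):
    """(w1, w2) = (a cos(alpha t + eps), a sin(alpha t + eps))."""
    phase = alpha * t + epsilon
    return a * math.cos(phase), a * math.sin(phase)


if __name__ == "__main__":
    # Assumed moments in arbitrary units; w3 is one turn per day.
    print(euler_period(300.0, 301.0, 2 * math.pi), "days")
```

With the angular velocity expressed in radians per day, the period comes out directly in days.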

We can get an idea of the actual motion most clearly from a 
diagram. In Fig. 15, we show an oblate spheroid, to represent 
the symmetrical body. There is a circular conical hole Obd 
cut out surrounding the north pole a, and a fixed cone touching 
the inside of this hole, and centered on the line Oc. The motion 
is now as if one cone rolled on the other. We see at once that, 
since the axis Ob is instantaneously at rest, it is the instantaneous 
axis of rotation co. As time goes on, this axis of rotation traces 
out the cone Obd with respect to the body, and at the same time 
traces out the cone Obe fixed in space. The axis of the fixed 
cone, Oc, is the direction of the constant total angular momentum 
vector. Other properties of the motion are discussed in a problem. 


Fig. 15. Space and body cones for the torque-free rotation of a symmetrical 
body. The cone Obd, fixed in the body, rolls on the cone Obe, fixed in space. The 
line Oa is the axis of symmetry of the body, Ob is the instantaneous axis of rotation, 
Oc the fixed axis of total angular momentum. 

67. Euler's Angles. If we wish information about the general 
motion of the top, we must introduce some set of coordinates 
capable of describing its position. So far, we have not had any 
set of coordinates at all. We have worked with angular velocities, 
and angular momenta, which were vectors, and all the equations 
came out very neatly and symmetrically in terms of them. But 
there is a peculiar thing about the three components of angular 
velocity: there are no corresponding angles to serve as coordinates. 
This is not true in plane motions. If a body rotates 
with angular velocity $\omega$ about a fixed axis, we can regard $\omega$ as 
$\dot\theta$, where $\theta$ is the angle through which the body has turned about 
the fixed axis, and which can be used as a coordinate. Then we 
can say that the component of angular momentum $I\omega$ is the 
momentum conjugate to $\theta$, and the whole Lagrangian and Hamiltonian 
methods go through perfectly. As soon as we have three 
dimensions, however, and the possibility of different axes of 
rotation, we no longer have such angles. It is readily seen, for 
instance (we leave it for a problem), that one cannot use the 
angles through which the body has turned about the three 
coordinate axes as variables. The fact is that, though angular 
momentum is a vector, finite angular rotations are not, and do 
not have three components which can be used as coordinates. 
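
The failure of finite rotations to combine like vectors can be seen from two quarter turns. The sketch below is an illustration under stated assumptions: rotations are represented by 3 x 3 matrices acting on column vectors (a representation the text does not use), and the helper names are hypothetical.

```python
def matmul(P, Q):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Rotations through +90 deg about the x axis and about the y axis.
ROT_X_90 = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
ROT_Y_90 = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]


def x_then_y():
    """Rotate about x first, then about y (the later rotation stands on the left)."""
    return matmul(ROT_Y_90, ROT_X_90)


def y_then_x():
    """Rotate about y first, then about x."""
    return matmul(ROT_X_90, ROT_Y_90)


if __name__ == "__main__":
    print(x_then_y())
    print(y_then_x())
```

If the two rotations could be added like vectors, the order would not matter; the two products are in fact different matrices.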



We are forced, then, by the peculiar nature of angular rotations, 
to look for some set of three angles to describe the position 
of the body, which unfortunately cannot have the symmetrical 
nature of the x, y, z components of angular velocity. The usual 
set of angles are called Euler's angles, and are shown in Fig. 16. 
We ordinarily use these angles for discussing a symmetrical 
body. Then Oz is a fixed axis, for example, the vertical in the 
top problem. $OC_0$ is the axis of figure of the body, taken as the 
third principal axis. $\theta$ measures the angle between axis of figure 
and fixed axis, $\phi$ measures the angle of precession of the axis 
of figure, so that $d\phi/dt$ is the angular velocity of precession, and 
$\psi$ measures the rotation of the body about its axis of figure 
measured from the line ON, called the nodal line. Thus we see 
that, though the Eulerian angles do not have symmetry, they are 
very natural ones for the problem in hand. 

Fig. 16. Euler's angles. For a symmetrical body, $OC_0$ is the axis of 
symmetry, $OA_0$ and $OB_0$ two axes fixed in the body at right angles. $\theta$ and $\phi$ measure 
colatitude and longitude of the direction of the principal axis; $\psi$ measures the 
rotation of the body about the principal axis. 

Let us set up the components of angular velocity, and the 
kinetic energy, in terms of the Eulerian angles. The motion 
of the body may be thought of as consisting of a rotation of the 
body about $OC_0$ and the motion of $OC_0$ relative to the fixed 
frame of reference. The former is described by the angular 
velocity $\dot\psi$ which has the components 0, 0, $\dot\psi$ (referred to the 
principal axes). The latter motion consists of (a) a rotation $\dot\theta$ 
about the nodal line ON as an axis, which was zero in the steady 
motion of the top considered above; and (b) of a precession $\dot\phi$ 
about the z axis. The components of these angular velocities 
along the principal axes $OA_0$, $OB_0$, and $OC_0$ are 

(a) $\dot\theta\cos\psi$; $-\dot\theta\sin\psi$; 0 

(b) $\dot\phi\sin\theta\sin\psi$; $\dot\phi\sin\theta\cos\psi$; $\dot\phi\cos\theta$. 

Adding these angular velocity components, we have 

$$\omega_1 = \dot\theta\cos\psi + \dot\phi\sin\theta\sin\psi$$

$$\omega_2 = -\dot\theta\sin\psi + \dot\phi\sin\theta\cos\psi$$

$$\omega_3 = \dot\psi + \dot\phi\cos\theta.$$

Here $\dot\psi$ corresponds to the quantity $\omega$ used for discussing the 
steady motion, and $\dot\phi$ to $\omega_1$. From these components of angular 
velocity, of course, we can at once get the angular momenta. 
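
The three components translate directly into a short function. This is a sketch for illustration only, with hypothetical argument names; it also shows that $\omega_1^2 + \omega_2^2$ does not depend on $\psi$, since the cross terms cancel.

```python
import math


def body_angular_velocity(theta, psi, theta_dot, phi_dot, psi_dot):
    """Components (w1, w2, w3) of the angular velocity along the principal
    axes, in terms of Euler's angles and their time derivatives."""
    w1 = theta_dot * math.cos(psi) + phi_dot * math.sin(theta) * math.sin(psi)
    w2 = -theta_dot * math.sin(psi) + phi_dot * math.sin(theta) * math.cos(psi)
    w3 = psi_dot + phi_dot * math.cos(theta)
    return w1, w2, w3


def transverse_square(theta, theta_dot, phi_dot):
    """theta_dot^2 + sin^2(theta) phi_dot^2, which should equal w1^2 + w2^2."""
    return theta_dot**2 + (math.sin(theta) * phi_dot) ** 2
```

The angle $\phi$ itself does not appear among the arguments, because only its rate of change enters the components.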

The kinetic energy, as we have seen, is $\tfrac{1}{2}(A_0\omega_1^2 + B_0\omega_2^2 + 
C_0\omega_3^2)$. But in our case of a symmetrical top this simplifies, 
since $A_0 = B_0$, and substituting we have 

$$T = \tfrac{1}{2}[A_0(\omega_1^2 + \omega_2^2) + C_0\omega_3^2] = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0(\dot\psi + \dot\phi\cos\theta)^2]. \tag{11}$$

Using the kinetic energy, or corresponding Lagrangian function, 
in terms of the Eulerian angles, we can easily derive the Lagrangian 
equations of motion, and find them to be the same Euler 
equations which we have already obtained. For instance, 
using $L = T$, 

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot\psi}\right) - \frac{\partial L}{\partial\psi} = \frac{d}{dt}[C_0(\dot\psi + \dot\phi\cos\theta)] = C_0\dot\omega_3 = M_3,$$

which is the third of Euler's equations, when we remember that 
$A_0 - B_0 = 0$. 

68. General Motion of a Symmetrical Top under Gravity. 
We are now ready to proceed with the general discussion of the 
top under gravity, for which we have already considered the 
steady precession. We note first that the torque is at right 
angles to the axis of figure. Hence by the third of Euler's 
equations, $\dot\omega_3 = 0$, or $\omega_3$ is constant. Instead of using the other 
two of Euler's equations, it is somewhat more convenient to 
use the conservation of energy and of angular momentum to 
discuss the motion, much as we did in our earlier chapter on 
central motion. For the kinetic energy we have 
$T = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0\omega_3^2]$, and the potential energy is $Mgl\cos\theta$, where $l$ 
is the distance from O to the center of mass. Thus the energy 
equation becomes 

$$E = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0\omega_3^2] + Mgl\cos\theta, \tag{12}$$

where $E$ is the total energy. We now can eliminate $\dot\phi$ from the 
equation above by utilizing the fact that there are no torques 
taken about the z axis. This means that the component of angular 
momentum along this axis is constant. The angular momentum 
due to the rotation $\omega_3$ of the top about its axis of symmetry 
has a vertical component $C_0\omega_3\cos\theta$. The angular velocity $\dot\theta$ of 
the axis contributes nothing to the vertical angular momentum. 
The other component of the angular velocity of the axis is 
$\sin\theta\,\dot\phi$, and this is about an axis perpendicular to $OC_0$, making an 
angle of $\pi/2 - \theta$ with the vertical. Thus the contribution of 
this term is $A_0\sin^2\theta\,\dot\phi$, so that the conservation of angular 
momentum about the z axis yields 

$$p_z = C_0\omega_3\cos\theta + A_0\sin^2\theta\,\dot\phi. \tag{13}$$

We now substitute the value of $\dot\phi$ taken from this equation into 
the energy equation and get a differential equation for $\theta$ alone, 
so that we may discuss the time variations of $\theta$, or the variations 
of the inclinations of the axis of figure of the top with the vertical. 
When we make this substitution, and solve for $\dot\theta$, we have 

$$\dot\theta = \sqrt{2(E - V')/A_0}, \tag{14}$$

where $V'$, which plays the part of a fictitious potential energy for 
the motion of this coordinate, has the value 

$$V' = Mgl\cos\theta + \tfrac{1}{2}C_0\omega_3^2 + \frac{(p_z - C_0\omega_3\cos\theta)^2}{2A_0\sin^2\theta}. \tag{15}$$

The first term is the gravitational energy, decreasing as the angle 
increases, showing that gravity tends to make the top fall. The 
second is a constant, the energy of the spinning motion. The 
third term is a dynamic term, reminding us of the centrifugal 
force term in the effective energy for the radial motion in a 
central field. It becomes infinite when $\theta = 0$ or $\pi$, since at those 
angles the rate of precession $\dot\phi$ would have to be infinitely rapid 
in order to conserve the angular momentum component $p_z$, 
contributing therefore an infinite amount to the energy. 
Between these angles this dynamic term has a single minimum. 
In other words, it exerts a stabilizing influence, quite apart from 
any external forces which may act, and leads to a stable oscillation 
of $\theta$ about a certain minimum of $V'$, whose position is 
determined by the external torque. 
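
The elimination of $\dot\phi$ between Eqs. (12) and (13) can be verified numerically. The sketch below is illustrative only: it takes the fictitious potential to be the gravitational term, the constant spin energy, and the dynamic term $(p_z - C_0\omega_3\cos\theta)^2/(2A_0\sin^2\theta)$ that results from that elimination, and all function names and numerical values are assumptions.

```python
import math


def vertical_momentum(theta, phi_dot, A0, C0, w3):
    """p_z from Eq. (13)."""
    return C0 * w3 * math.cos(theta) + A0 * math.sin(theta) ** 2 * phi_dot


def total_energy(theta, theta_dot, phi_dot, A0, C0, w3, Mgl):
    """E from Eq. (12)."""
    kinetic = 0.5 * (A0 * (theta_dot**2 + (math.sin(theta) * phi_dot) ** 2)
                     + C0 * w3**2)
    return kinetic + Mgl * math.cos(theta)


def effective_potential(theta, pz, A0, C0, w3, Mgl):
    """Gravitational term + spin energy + dynamic term (needs 0 < theta < pi)."""
    dynamic = (pz - C0 * w3 * math.cos(theta)) ** 2 / (2.0 * A0 * math.sin(theta) ** 2)
    return Mgl * math.cos(theta) + 0.5 * C0 * w3**2 + dynamic


def theta_dot_magnitude(theta, E, pz, A0, C0, w3, Mgl):
    """|theta_dot| from Eq. (14); zero where E does not exceed the potential."""
    gap = E - effective_potential(theta, pz, A0, C0, w3, Mgl)
    return math.sqrt(max(0.0, 2.0 * gap / A0))
```

Starting from any assumed state, computing $E$ and $p_z$ and feeding them back into Eq. (14) should return the rate $\dot\theta$ that was put in.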

69. Precession and Nutation. The minimum of $V'$ can be 
determined by differentiating with respect to $\theta$, setting the result 
equal to zero. This gives 

$$0 = -Mgl\sin\theta + \frac{p_z - C_0\omega_3\cos\theta}{A_0\sin^2\theta}\,C_0\omega_3\sin\theta - A_0\cos\theta\sin\theta\left(\frac{p_z - C_0\omega_3\cos\theta}{A_0\sin^2\theta}\right)^2,$$

or, from Eq. (13), 

$$\dot\phi[C_0\dot\psi - (A_0 - C_0)\dot\phi\cos\theta] = Mgl. \tag{16}$$

If the energy is equal to the effective potential $V'$ at this angle, 
$\dot\theta$ will be zero, and the motion is a pure precession of the sort 
described in Sec. 61. If we assume that the rate of precession 
is small compared with the rate of rotation, which is the only case in 
which the angular momentum, the angular velocity, and the axis of 
figure are nearly enough in the same straight line so that the arguments 
of that section are valid, we have $\dot\phi \ll \dot\psi$. In that case the 
equation becomes $\dot\phi(C_0\dot\psi) = Mgl$, $\dot\phi = Mgl/(C_0\dot\psi)$, in agreement 
with the result of Sec. 61, when we recall that in this limit $C_0\dot\psi$ is 
approximately the total angular momentum. This condition, or rather 
the accurate condition (16), determines the rate of steady precession 
$\dot\phi$ for any total angular momentum, a rate independent of $\theta$ to this 
approximation, but depending on $\theta$ if we must consider terms in $\dot\phi^2$. 

Fig. 17. Nutation of a top. The sinusoidal curve is the projection 
of the axis of the top on a sphere. $\theta_1$ and $\theta_2$ are angular 
limits of the nutational motion. 

If the energy $E$ is greater than the minimum of $V'$, the curve 
of $E$ will cut that for $V'$ at two values of $\theta$, one greater and one 
less than the inclination of the axis for the purely precessional 
motion which we have just discussed. In this case, $\theta$ will oscillate 
between these two limits. This oscillation is called nutation. 
The complete motion then consists of a combination of this 
nutation with a precession, as indicated in Fig. 17, where we draw 
the intersection of the axis of the top with a sphere. The angles 
$\theta_1$ and $\theta_2$ are the two angles for which $E = V'$, so that the 
minimum of $V'$, or the angle for the pure precessional motion 
corresponding to the same angular momentum, lies between these 
two values. In the problems, the frequency of the nutational 
motion is discussed. We also discuss, in Prob. 9, the special 
case of the "sleeping" top, in which the top starts spinning 
vertically. In this special case, the dynamic term in $V'$ is finite 
at $\theta = 0$, so that under certain circumstances oscillations about 
the vertical can occur. 
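
Condition (16) is a quadratic in the rate of precession, and its slow root can be compared with the approximate value $Mgl/(C_0\dot\psi)$. The sketch below is an illustration under assumptions: the function name and the numerical values are hypothetical, $Mgl$ is taken to be positive, and the roots are found with the usual cancellation-free form of the quadratic formula.

```python
import math


def steady_precession_rates(A0, C0, psi_dot, theta, Mgl):
    """Real roots phi_dot of Eq. (16), slow root first.

    Eq. (16) rearranged reads  a x^2 + b x + c = 0  with
    a = (A0 - C0) cos(theta),  b = -C0 psi_dot,  c = Mgl.
    Assumes Mgl > 0 and, when a vanishes, psi_dot != 0.
    """
    a = (A0 - C0) * math.cos(theta)
    b = -C0 * psi_dot
    c = Mgl
    if a == 0.0:
        return [-c / b]  # linear case: phi_dot = Mgl / (C0 psi_dot)
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []  # spin too slow for a steady precession at this angle
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    return sorted([q / a, c / q], key=abs)


if __name__ == "__main__":
    # Hypothetical rapidly spinning top.
    print(steady_precession_rates(1.0, 0.5, 100.0, 0.5, 1.0))
    print(1.0 / (0.5 * 100.0))  # approximate slow rate Mgl / (C0 psi_dot)
```

For a rapid spin the slow root lies close to the approximate rate, while the second root is a fast precession for which the approximation of Sec. 61 does not hold.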


Problems 

1. Prove directly that the moment of inertia $I$, equal to $\sum m\rho^2$, is equal 
to $\lambda^2 A + \mu^2 B + \nu^2 C - 2\mu\nu D - 2\nu\lambda E - 2\lambda\mu F$, where $\lambda$, $\mu$, $\nu$ are the 
direction cosines of the axis of rotation. 

2. Show that, if $T$ is the kinetic energy of a rotating body, $\mathbf{p}$ its angular 
momentum, $\boldsymbol{\omega}$ its angular velocity, $p_x = \partial T/\partial\omega_x$, and $2T = p_x\omega_x + p_y\omega_y + p_z\omega_z$. 


3. In Fig. 15, show that $\tan\angle aOb = a/\omega_3$, and $\tan\angle aOc = \dfrac{A_0}{C_0}\dfrac{a}{\omega_3}$, where 
$\omega_3$, $a$ represent the components of the angular velocity along and at right 
angles to the figure axis. Knowing that the time required for the axis Ob 
of angular velocity to perform a complete rotation with respect to the body 
is $\tau = \dfrac{2\pi}{\omega_3}\dfrac{A_0}{C_0 - A_0}$, show that the time for it to perform a complete rotation in 
space is approximately $\dfrac{2\pi}{\omega_3}\dfrac{A_0}{C_0}$ if angles aOb and aOc are small. Hence show 
that for the earth the axis of angular velocity is not fixed, but rotates about 
a fixed direction approximately once a day. 

4. The earth is acted on by torques exerted by the sun and moon, and as a 
consequence its angular momentum precesses about a fixed direction in 
space. This is entirely separate from the effect of Fig. 15 and Prob. 3, 
which we now neglect. This precession has a period of 25,800 years, and 
carries the angular momentum about a cone of semi-vertical angle 23° 27', 
so that the pole in succession points to different parts of the heavens, result- 
ing in the precession of the equinoxes, and in the fact that different stars 
act as pole star at different periods of history. Show that the motion can 
be represented by the rolling of a cone fixed in the earth, of diameter 21 in. 
at the north pole, on a cone of angle 23° 27' fixed in the heavens. 

5. A system of electrons moving about a center of attraction has a certain 
angular momentum, equal to $\sum m(\mathbf{r} \times \mathbf{v})$, and also a magnetic moment, 
equal to $\sum \dfrac{e}{2c}(\mathbf{r} \times \mathbf{v})$, where $e$ is the charge and $m$ the mass of an electron, 
$c$ the velocity of light. This magnetic effect results because the electrons 
in rotation act like little currents, which in turn have magnetic fields like 
bar magnets. An external magnetic field $\mathbf{H}$ exerts a torque on the system, 
equal to the vector product of the magnetic moment and $\mathbf{H}$. Show that 
under the action of the field, the system of electrons precesses with angular 
velocity $eH/2mc$ about the direction of the field. This precession, which, 
as we see, is independent of the velocities of the electrons, is called Larmor's 
precession. 
6. One reason why finite rotations do not act as vectors is that they do not 
commute, that is, the same two rotations applied in one order lead to one 
answer, but in the opposite order to a very different answer. Demonstrate 
this by diagrams, imagining that we have a cube (label its faces by different 
letters or numbers on a diagram), originally in one position, with its edges 
parallel to the coordinate axes (position a). First rotate through 90 deg. 
about the x axis (position b), then through 90 deg. about the y axis (position 
c), drawing diagrams of each step. Then, starting again from position a, 
rotate first through 90 deg. about the y axis (position d) and then through 
90 deg. about the x axis (position e). Show that (c) and (e) are entirely 
different orientations. 

7. Write down the kinetic energy of a nonsymmetrical body, in terms of 
Euler's angles. Derive the Lagrangian equation for $\psi$, and show that it 
reduces to one of Euler's equations. 

8. In the same way as in Prob. 7, set up the other two Lagrangian equa- 
tions, showing that they lead to the other two of Euler's equations. 

9. A top is started spinning vertically, with no other motion, so that 
initially $\theta = 0$, $d\theta/dt = 0$. Show that $p_z = C_0\omega_3$, $E = \tfrac{1}{2}C_0\omega_3^2 + Mgl$. 
Substituting these in the expression of Eq. (14) for $\dot\theta$, show that if $\omega_3 > \omega'$, where 
$(\omega')^2 = 4MglA_0/C_0^2$, the angle must remain equal to zero, but that if $\omega_3$ 
falls below $\omega'$, $\theta$ will oscillate between 0 and the angle $\cos^{-1}[2(\omega_3/\omega')^2 - 1]$. 
Experimentally, if a top is started as we have described, with $\omega_3 > \omega'$, there 
will be a frictional torque decreasing $\omega_3$, and as soon as the torque reduces $\omega_3$ 
below $\omega'$, the top will begin to wobble. 

10. For a nutation of small amplitude about the steady precessional 
motion of a top, the angle $\theta$ oscillates sinusoidally about the equilibrium 
angle. Find the frequency of the nutation, by expanding the potential $V'$ 
in power series in $\theta - \theta_0$, where $\theta_0$ is the angle of steady precession with the 
same angular momentum. Retain only the constant and the term in 
$(\theta - \theta_0)^2$, and get the frequency by comparing with the corresponding 
expression for the linear oscillator. 


The mechanical problems which we have treated so far have 
been those where just one particle moved around, sometimes in a 
potential field, sometimes subject to forces not derivable from a 
potential. In many problems, however, there are several 
particles exerting forces on each other and influencing each other's 
motion. As examples, we have the actual solar system, where 
the sun, planets, and moons all act on each other; an atom, with 
the various electrons reacting; a molecule, with the atoms vibrat- 
ing under the action of their mutual forces. A more familiar 
case is that of several electric circuits coupled together by 
induction or some other method. Another is that in which 
several pendulums or springs can react on each other, as through 
their supports, and affect each other's motion. There is evi- 
dently a very wide variety of problems; we shall treat only the 
simplest, in which two linear oscillators, or electric circuits, are 
coupled together by a force depending linearly on both the 

70. Coupled Oscillators. Suppose we have two undamped 
one-dimensional oscillators, whose displacements are $y_1$ and $y_2$ 
respectively, and whose equations of motion, if uncoupled, would 
be 

$$m_1\frac{d^2y_1}{dt^2} + k_1y_1 = 0,$$

$$m_2\frac{d^2y_2}{dt^2} + k_2y_2 = 0.$$

Now let them be acted on by equal and opposite forces proportional 
to the distance apart, $-a(y_1 - y_2)$ and $-a(y_2 - y_1)$ 
respectively, as if there were a spring stretched between them. 
The equations then become 

$$m_1\frac{d^2y_1}{dt^2} + (k_1 + a)y_1 - ay_2 = 0,$$

$$m_2\frac{d^2y_2}{dt^2} + (k_2 + a)y_2 - ay_1 = 0.$$

As a matter of convenience in the calculation, we shall introduce 
changes of notation: let $y_1\sqrt{m_1} = x_1$, $y_2\sqrt{m_2} = x_2$, $(k_1 + a)/m_1 = 
\omega_1^2$, $(k_2 + a)/m_2 = \omega_2^2$, $a/\sqrt{m_1m_2} = c$. Then the equations are 

$$\frac{d^2x_1}{dt^2} + \omega_1^2x_1 - cx_2 = 0,$$

$$\frac{d^2x_2}{dt^2} + \omega_2^2x_2 - cx_1 = 0. \tag{1}$$

These are two simultaneous differential equations, and there are 
several ways of solving them. First we may take advantage of 
their property of being linear with constant coefficients, and see 
if we cannot get exponential solutions. We assume $x_1 = Ae^{i\omega t}$, 
$x_2 = Be^{i\omega t}$, where $A$, $B$, $\omega$ are to be determined. Substituting, 
we have 

$$(-\omega^2 + \omega_1^2)A - cB = 0,$$

$$-cA + (-\omega^2 + \omega_2^2)B = 0. \tag{2}$$

If we regarded $\omega$ as being known, these would be two simultaneous 
equations for the two constants $A$ and $B$. Evidently they 
are linear homogeneous equations. Now it is a theorem of 
algebra that in general two such equations do not have any 
solutions, unless the determinant of coefficients, 

$$\begin{vmatrix} -\omega^2 + \omega_1^2 & -c \\ -c & -\omega^2 + \omega_2^2 \end{vmatrix}, \tag{3}$$

is equal to zero. Let us see what this means. We could solve 
the first equation for $A$ in terms of $B$: $A = Bc/(-\omega^2 + \omega_1^2)$. 
But we could do the same with the second, $A = B(-\omega^2 + \omega_2^2)/c$. 
If these solutions are to be consistent, it must be that the two 
factors on the right are equal, $c/(-\omega^2 + \omega_1^2) = (-\omega^2 + \omega_2^2)/c$, 
or $(-\omega^2 + \omega_1^2)(-\omega^2 + \omega_2^2) - c^2 = 0$. But this is just the equation 
obtained by setting the determinant equal to zero, so that 
we have verified the result of algebra. Now the equation which 
we have obtained, called the secular equation, can be satisfied, 
for we still have $\omega$ at our disposal. Solving the quadratic, this 
gives 

$$\omega^2 = \frac{\omega_1^2 + \omega_2^2}{2} \pm \sqrt{\left(\frac{\omega_1^2 - \omega_2^2}{2}\right)^2 + c^2}. \tag{4}$$

This gives two values for $\omega^2$, or two different possible frequencies 
of motion for the system. This is natural, since we should have 
two frequencies if they were uncoupled, one for the one particle, 
the other for the other. Suppose the first, with the + sign, is 
called $\omega'$, and the second, with the - sign, $\omega''$. It is interesting 
to find $\omega'$ and $\omega''$, in the case where $c$, measuring the interaction 
between the particles, is small. Then we can expand by the 
binomial theorem, obtaining 

$$\omega'^2 = \omega_1^2 + \frac{c^2}{\omega_1^2 - \omega_2^2} + \cdots,$$

$$\omega''^2 = \omega_2^2 + \frac{c^2}{\omega_2^2 - \omega_1^2} + \cdots, \tag{5}$$

showing that the frequencies approach the natural frequencies 
of the separate systems when the coupling goes to zero, but that 
they differ from them by quantities which increase as $c$ increases. 
It is interesting to see that the frequencies are always spread 
apart by the interaction: if $\omega_1^2 > \omega_2^2$, then $\omega'^2 > \omega_1^2$, $\omega''^2 < \omega_2^2$, 
and correspondingly if the situation is reversed. There are 
several relations between $\omega'$ and $\omega''$ which we shall need, and 
which we write for reference; they are easily proved from the 
solutions already found, and hold independently of the size of $c$: 

$$\omega'^2\omega''^2 = \omega_1^2\omega_2^2 - c^2$$

$$\omega'^2 + \omega''^2 = \omega_1^2 + \omega_2^2$$

$$(-\omega'^2 + \omega_1^2)(-\omega''^2 + \omega_1^2) = -c^2. \tag{6}$$
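
The roots of the secular equation, and the three relations of Eq. (6), can be checked with a few lines of code. This is a minimal sketch for illustration: the function names and the sample masses and spring constants are assumptions, not values from the text.

```python
import math


def reduced_parameters(m1, m2, k1, k2, a):
    """(w1^2, w2^2, c) from the masses, spring constants, and coupling a."""
    return (k1 + a) / m1, (k2 + a) / m2, a / math.sqrt(m1 * m2)


def normal_frequencies_squared(w1sq, w2sq, c):
    """The two roots of Eq. (4): (omega'^2, omega''^2), + sign first."""
    mean = 0.5 * (w1sq + w2sq)
    root = math.sqrt((0.5 * (w1sq - w2sq)) ** 2 + c * c)
    return mean + root, mean - root


if __name__ == "__main__":
    # Hypothetical oscillators: unit masses, springs 3.5 and 0.5, coupling 0.5.
    w1sq, w2sq, c = reduced_parameters(1.0, 1.0, 3.5, 0.5, 0.5)
    print(normal_frequencies_squared(w1sq, w2sq, c))
```

Because the square root in Eq. (4) is never smaller than half the difference of $\omega_1^2$ and $\omega_2^2$, the two roots always lie outside the interval between the uncoupled values, which is the spreading apart described above.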

Having determined the two possible frequencies of vibration 
of the system, we next find the amplitudes $A'$ and $B'$ corresponding 
to $\omega'$, and $A''$ and $B''$ corresponding to $\omega''$. These are 
evidently given by 

$$\frac{A'}{B'} = \frac{c}{-\omega'^2 + \omega_1^2},$$

$$\frac{A''}{B''} = \frac{c}{-\omega''^2 + \omega_1^2}. \tag{7}$$

That is to say, the ratios of $A$'s to $B$'s are determined, but not 
the values themselves. The situation is then the following: we 
have one possible solution, $x_1 = A'e^{i\omega't}$, $x_2 = B'e^{i\omega't}$, where the 
ratio of the amplitudes of $x_1$ and $x_2$ is fixed, but the magnitudes 
are otherwise arbitrary. Of course, there is a similar solution 
with $-i\omega't$ in the exponent, so that combining these in the usual 
way we have an arbitrary phase and amplitude, or two arbitrary 
constants. Next we have also the solutions $x_1 = A''e^{i\omega''t}$, $x_2 = 
B''e^{i\omega''t}$, of the same sort. And now, on account of the linear 
nature of the equations, we can make linear combinations of 
these, obtaining 

$$x_1 = A'e^{i\omega't} + A''e^{i\omega''t},$$

$$x_2 = B'e^{i\omega't} + B''e^{i\omega''t}. \tag{8}$$

That is, each coordinate has two periods in its motion, or is 
doubly periodic. Since the amplitudes are to a certain extent 
arbitrary, it is possible for only one frequency to be excited at a 
time, or for both to go simultaneously. 

It is interesting to consider the physical nature of the motions 
described by these equations. Let us assume that the two systems 
are only loosely coupled together ($c$ is small). Then one 
possible mode of vibration has frequency $\omega'$, only slightly greater 
than the frequency $\omega_1$ which the first oscillator would have had 
without coupling. It is not a vibration of the first oscillator 
alone; both are vibrating at the same time. However, if we 
examine the coefficients $A'$, $B'$ in this case, we find that $B'$ is 
small compared with $A'$, meaning that the amplitude of the 
second oscillator is small compared with that of the first. Thus, 
using $B'/A' = (\omega_1^2 - \omega'^2)/c$, and $\omega'^2 = \omega_1^2 + c^2/(\omega_1^2 - \omega_2^2) + 
\cdots$, we have approximately $B'/A' = c/(\omega_2^2 - \omega_1^2)$. This is 
as if the first oscillator, vibrating with frequency $\omega'$, which is 
approximately $\omega_1$, and amplitude $A'$, were forcing the second 
oscillator by virtue of the coupling, with a force $cx_1$, or $cA'e^{i\omega't}$, 
or approximately $cA'e^{i\omega_1t}$. This would produce a forced amplitude 
of $(cA'e^{i\omega't})/(\omega_2^2 - \omega'^2)$, which is just what we have found. 
Similarly the second oscillator can vibrate almost by itself, 
with the frequency $\omega''$ which almost equals $\omega_2$, but it reacts back 
on the first and produces a small forced amplitude. It is now in 
the further approximations to the interaction that the differences 
between $\omega_1$ and $\omega'$, $\omega_2$ and $\omega''$, come in. 

We have considered the types of vibrations separately. But 
there is no reason why both cannot be simultaneously excited, 
so that each particle will be vibrating with both periods at once. 
Then the phenomenon of beats can easily come in; for the sum 
of two sinusoidal vibrations of different frequencies is equivalent 
to a single vibration of varying amplitude, as we see from the 
identity 

$$\cos\omega't + \cos\omega''t = \left(2\cos\frac{\omega' - \omega''}{2}t\right)\cos\frac{\omega' + \omega''}{2}t,$$

where the first expression, in parentheses, represents an amplitude 
oscillating with the slow frequency $(\omega' - \omega'')/2$, and modulating 
the latter term, a rapid vibration of frequency $(\omega' + \omega'')/2$. 
If $\omega'$ and $\omega''$ are approximately equal, the effect gets most 
marked, the frequency of the beats approaching zero. There is 
in this case a pulsation of amplitude and energy from one of the 
oscillators to the other. This is often seen in other similar 
problems. Thus, if a weight is hung from a spiral spring and is 
set vibrating up and down, it will be observed that after a certain 
lapse of time the vertical motion will decrease, but there will be a 
torsional motion of considerable amplitude. As time goes on, 
these two forms of motion will alternately take up large amplitudes. 
The reason is that there is a coupling between the two 
forms of oscillation, and the beat phenomenon we have just 
described comes into play. 
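
The equivalence of the sum of two cosines to a slowly modulated rapid vibration can be confirmed numerically. The sketch below is illustrative only; the function names and the two sample frequencies are assumptions.

```python
import math


def two_vibrations(w_prime, w_double_prime, t):
    """Sum of two unit-amplitude vibrations of different frequencies."""
    return math.cos(w_prime * t) + math.cos(w_double_prime * t)


def envelope(w_prime, w_double_prime, t):
    """Slowly varying amplitude 2 cos((w' - w'') t / 2)."""
    return 2.0 * math.cos(0.5 * (w_prime - w_double_prime) * t)


def modulated_vibration(w_prime, w_double_prime, t):
    """Envelope times the rapid vibration of frequency (w' + w'') / 2."""
    carrier = math.cos(0.5 * (w_prime + w_double_prime) * t)
    return envelope(w_prime, w_double_prime, t) * carrier


if __name__ == "__main__":
    # Two nearly equal hypothetical frequencies give slow, pronounced beats.
    for t in (0.0, 5.0, 10.0, 15.0):
        print(t, two_vibrations(10.0, 9.8, t), modulated_vibration(10.0, 9.8, t))
```

The envelope vanishes whenever $(\omega' - \omega'')t/2$ is an odd multiple of $\pi/2$, which is where the motion of one oscillator momentarily dies out.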

71. Normal Coordinates. We have just seen that the general 
oscillation of two coupled particles is a sum of two vibrations of 
different frequencies. If only one of these vibrations is excited, 
both particles oscillate with the same frequency but different 
amplitudes. It now proves to be possible to introduce new 
coordinates $X$ and $Y$, called normal coordinates, given by linear 
combinations of the displacements $x_1$ and $x_2$ of the two particles, 
which have the following properties: the generalized force 
acting on $X$ is proportional to $X$ alone, independent of $Y$, so 
that the equations of motion are separated, and $X$ and $Y$ execute 
independent simple harmonic vibrations, of different frequencies. 
When one of the coordinates alone is different from zero, the 
other remaining equal to zero, just one of the two vibrations is 
excited. The existence of such coordinates is made plausible 
by the following fact: if one vibration alone is excited, $x_1$ is 
proportional say to $\alpha$ times a sinusoidal function of time, $x_2$ to $\beta$ 
times the same sinusoidal function. In this case $\beta x_1 - \alpha x_2$ will 
be always zero. This linear combination of $x_1$ and $x_2$ will be 
proportional, then, to the normal coordinate associated with the 
second type of vibration, which is not excited in the case mentioned. 
By assuming that the second vibration alone is excited, 
we can in a similar way infer the form of the first normal coordinate. 
We proceed in the next paragraph to the general formulation 
of the normal coordinates. 

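Suppose we set up quantities $X$, $Y$, defined by the equations 

$$x_1 = \alpha'X + \alpha''Y,$$

$$x_2 = \beta'X + \beta''Y, \tag{9}$$

where $C\alpha' = A'$, $C\beta' = B'$, $D\alpha'' = A''$, $D\beta'' = B''$, $C$ and $D$ 
being constants. Since only the ratios of the $\alpha$'s and $\beta$'s are so 
far determined, we may demand that the magnitudes be so 
fixed in this case that $\alpha'^2 + \beta'^2 = 1$, $\alpha''^2 + \beta''^2 = 1$. This is 
called the condition of normalization, and we shall see its significance 
a little later. Our quantities $X$ and $Y$ can now be treated 
as generalized coordinates, and we can easily see that the equations 
of motion, in terms of them, have the variables separated. 
Let us set up the equations of motion in these new variables. 
We have 

$$T = \frac{1}{2}\left[\left(\frac{dx_1}{dt}\right)^2 + \left(\frac{dx_2}{dt}\right)^2\right]$$

$$= \frac{1}{2}\left[\left(\alpha'\frac{dX}{dt} + \alpha''\frac{dY}{dt}\right)^2 + \left(\beta'\frac{dX}{dt} + \beta''\frac{dY}{dt}\right)^2\right]$$

$$= \frac{1}{2}\left[(\alpha'^2 + \beta'^2)\left(\frac{dX}{dt}\right)^2 + (\alpha''^2 + \beta''^2)\left(\frac{dY}{dt}\right)^2 + 2(\alpha'\alpha'' + \beta'\beta'')\frac{dX}{dt}\frac{dY}{dt}\right].$$

Using the relations (6) and (7), the last term can be shown to be 
zero. This is called the condition of orthogonality, for reasons 
which will later be evident. Using the normalization conditions 
mentioned above, we have finally 

$$T = \frac{1}{2}\left[\left(\frac{dX}{dt}\right)^2 + \left(\frac{dY}{dt}\right)^2\right]. \tag{10}$$

Next for the potential energy we have, from the original 
equations, 

$$V = \tfrac{1}{2}(\omega_1^2x_1^2 + \omega_2^2x_2^2 - 2cx_1x_2)$$

$$= \tfrac{1}{2}\{\omega_1^2(\alpha'X + \alpha''Y)^2 + \omega_2^2(\beta'X + \beta''Y)^2 - 2c(\alpha'X + \alpha''Y)(\beta'X + \beta''Y)\}$$

$$= \tfrac{1}{2}\{(\omega_1^2\alpha'^2 + \omega_2^2\beta'^2 - 2c\alpha'\beta')X^2 + (\omega_1^2\alpha''^2 + \omega_2^2\beta''^2 - 2c\alpha''\beta'')Y^2 + 2[\omega_1^2\alpha'\alpha'' + \omega_2^2\beta'\beta'' - c(\alpha'\beta'' + \alpha''\beta')]XY\}.$$

Here it can be shown by a little manipulation that the first 
parenthesis equals $\omega'^2$, the second $\omega''^2$, and the third is zero, so 
that 

$$V = \tfrac{1}{2}(\omega'^2X^2 + \omega''^2Y^2). \tag{11}$$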

In terms of the new variables, the variables are separated, and 


Lagrange's equations become simply $d^2X/dt^2 + \omega'^2 X = 0$,
$d^2Y/dt^2 + \omega''^2 Y = 0$, whose solutions are $X = \text{constant} \times e^{i\omega' t}$,
$Y = \text{constant} \times e^{i\omega'' t}$. Thus each of the generalized coordinates
executes a simple harmonic motion, which of course can have
arbitrary amplitude and phase, and our final result, if we set the
first constant equal to C, the second to D, is

$$x_1 = \alpha' X + \alpha'' Y = \alpha'(Ce^{i\omega' t}) + \alpha''(De^{i\omega'' t}) = A'e^{i\omega' t} + A''e^{i\omega'' t},$$

etc., agreeing with the results already found. 
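The construction above can be checked numerically. The following is a minimal sketch, not the text's own procedure: it assumes the potential and kinetic energy in the forms quoted above, takes illustrative values for $\omega_1^2$, $\omega_2^2$ and c, and uses a hypothetical helper `normal_modes`. Which of the two modes is labeled with one prime and which with two is an arbitrary choice here.

```python
import math

def normal_modes(w1_sq, w2_sq, c):
    """Return [(freq_sq, alpha, beta), ...] for the two normal modes.

    Assumes V = (w1_sq*x1**2 + w2_sq*x2**2 - 2*c*x1*x2)/2 and
    T = (v1**2 + v2**2)/2, with c != 0.
    """
    if c == 0:
        raise ValueError("c = 0: the oscillators are already uncoupled")
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.hypot(0.5 * (w1_sq - w2_sq), c)
    modes = []
    for freq_sq in (mean + root, mean - root):
        # From (w1_sq - freq_sq)*alpha - c*beta = 0 take
        # (alpha, beta) proportional to (c, w1_sq - freq_sq), then normalize.
        norm = math.hypot(c, w1_sq - freq_sq)
        modes.append((freq_sq, c / norm, (w1_sq - freq_sq) / norm))
    return modes

w1_sq, w2_sq, c = 4.0, 1.0, 1.0  # illustrative values only
(fp, ap, bp), (fpp, app, bpp) = normal_modes(w1_sq, w2_sq, c)

# The three parentheses of the potential energy, and the orthogonality sum.
first = w1_sq * ap**2 + w2_sq * bp**2 - 2 * c * ap * bp
second = w1_sq * app**2 + w2_sq * bpp**2 - 2 * c * app * bpp
cross = w1_sq * ap * app + w2_sq * bp * bpp - c * (ap * bpp + app * bp)
orthogonality = ap * app + bp * bpp

print(first - fp, second - fpp, cross, orthogonality)  # all close to zero
```

The printed differences vanish to rounding error, which is the statement that the first parenthesis equals one squared normal frequency, the second the other, and the cross term is zero.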

It may be proved in general that for any mechanical problem 
in which the potential is a quadratic function of the coordinates, 
coordinates of this kind (called normal coordinates) can be set 
up, having the property that they have no cross terms between 
different coordinates in either the kinetic or the potential energy, 
so that the Lagrangian function is a sum of squares of coordinates 
and velocities, with constant coefficients, and the variables are 
separated in the Lagrangian equations. The general method of 
setting up these normal coordinates follows exactly the model 
we have found for our simple problem. This is one of the few 
sorts of mechanical problems in which a general solution is possi- 
ble, for no such theorem holds with other laws of force. The 
equations of motion for the normal coordinates are just like those 
for harmonic oscillators, so that their solutions are sinusoidal 
vibrations. In general, there are then as many fundamental 
periods in the motion as there are coordinates, so that the motion
is multiply periodic. 

The normal coordinates are of particular value when we come 
to discuss the action of external forces on the coupled systems. 
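For suppose there are external forces $F_1$ and $F_2$ acting on the two
particles respectively, in addition to the elastic forces already
considered. Then we can set up the generalized forces acting on
the two normal coordinates, by the method described in Chap.
VIII. If these are $F_X$ and $F_Y$, we have

$$F_X = F_1\frac{\partial x_1}{\partial X} + F_2\frac{\partial x_2}{\partial X} = \alpha' F_1 + \beta' F_2,$$
$$F_Y = F_1\frac{\partial x_1}{\partial Y} + F_2\frac{\partial x_2}{\partial Y} = \alpha'' F_1 + \beta'' F_2.$$

Then the equations of motion are simply

$$\frac{d^2X}{dt^2} + \omega'^2 X = F_X, \qquad \frac{d^2Y}{dt^2} + \omega''^2 Y = F_Y,$$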

showing that these normal coordinates have the same sort of 
equations of motion, under the action of external forces, as single 
oscillators. Thus the complete solution will be a sum of a partic- 
ular solution of the inhomogeneous equations, consisting of 
vibrations of the same nature as the external force, capable, 
therefore, of showing resonance phenomena, and of a general 
solution of the homogeneous equations, of the sort we have found. 

Fig. 18. — Rotation of coordinates. The distances OA and PA are the x 
and y coordinates of the point P, and OB and PB are the X and Y coordinates. 

Under certain circumstances, a damping force proportional to the 
velocity will also be expressed in terms of normal coordinates as 
a constant times the time rate of change of the normal coordinate, 
but this is not always true. We shall discuss this question in 
Chapter XIII. 
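As an illustration of the forced problem without damping, the sketch below computes the steady-state amplitudes for forces $F_1 = f_1\cos\omega t$, $F_2 = f_2\cos\omega t$ in two ways: through the normal coordinates, each responding like a single oscillator, and by solving the two original equations directly. The sinusoidal form of the forces, the numerical values, and the helper names are assumptions made for the example.

```python
import math

def modes(w1_sq, w2_sq, c):
    """Normalized (freq_sq, alpha, beta) for both normal modes; assumes c != 0."""
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.hypot(0.5 * (w1_sq - w2_sq), c)
    out = []
    for freq_sq in (mean + root, mean - root):
        norm = math.hypot(c, w1_sq - freq_sq)
        out.append((freq_sq, c / norm, (w1_sq - freq_sq) / norm))
    return out

def via_normal_coordinates(w1_sq, w2_sq, c, f1, f2, w):
    """Amplitudes (x1, x2) of the forced vibration, summed over the two modes."""
    x1 = x2 = 0.0
    for freq_sq, alpha, beta in modes(w1_sq, w2_sq, c):
        gen_force = alpha * f1 + beta * f2      # generalized force on this mode
        q = gen_force / (freq_sq - w * w)       # single-oscillator response
        x1 += alpha * q
        x2 += beta * q
    return x1, x2

def direct(w1_sq, w2_sq, c, f1, f2, w):
    """Solve (w1_sq - w^2) x1 - c x2 = f1, -c x1 + (w2_sq - w^2) x2 = f2."""
    p, q = w1_sq - w * w, w2_sq - w * w
    det = p * q - c * c
    return (f1 * q + c * f2) / det, (p * f2 + c * f1) / det

args = (4.0, 1.0, 1.0, 1.0, 0.0, 1.5)  # illustrative values, away from resonance
print(via_normal_coordinates(*args))
print(direct(*args))
```

The two results agree, and the denominator `freq_sq - w * w` shows the resonance that appears when the driving frequency approaches either normal frequency.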

72. Relation of Problem of Coupled Systems to Two-dimen- 
sional Oscillator. — Our problem of two coupled one-dimensional 
oscillators reminds us strongly of the case of two-dimensional 
oscillators encountered in Chap. IX. Here, as there, we have 
two coordinates ($x_1$ and $x_2$ here, x and y there), and linear restoring
forces. But the difference is that here the restoring force
acting on each coordinate depends on the values of both. The
corresponding problem in the two-dimensional case would be
that where $F_x = -ax + cy$, $F_y = -by + cx$, where $a = \omega_1^2$,
$b = \omega_2^2$. And obviously the problem can be solved just as we
have treated our case of the coupled oscillators. That is, we
introduce new variables X, Y, defined by the equations
$x = \alpha' X + \alpha'' Y$, $y = \beta' X + \beta'' Y$, where the $\alpha$'s and $\beta$'s have the values
found above, and in terms of the new variables X and Y we have
separation, and get a solution in which X and Y execute periodic 
vibrations of different frequencies. 

But now we can get a very simple geometrical interpretation 
of our change of variables: it is merely a rotation of coordinates. 
To see this, let us first consider what a rotation of coordinates 
means analytically. In Fig. 18, we see old coordinates xy, and 
new, rotated ones, XY. The xy and XY coordinates of a point 
P are indicated. Now there is a very simple vector way of 
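writing the coordinates. Let $\mathbf{i}$, $\mathbf{j}$ be the unit vectors along x and
y, respectively, and $\mathbf{I}$, $\mathbf{J}$ along X, Y. Further, let $\mathbf{r}$ be the radius
vector from the origin to point P. Then evidently we have
$x = (\mathbf{i}\cdot\mathbf{r})$, $y = (\mathbf{j}\cdot\mathbf{r})$, $X = (\mathbf{I}\cdot\mathbf{r})$, $Y = (\mathbf{J}\cdot\mathbf{r})$. But we can
express $\mathbf{i}$ and $\mathbf{j}$ in terms of $\mathbf{I}$ and $\mathbf{J}$, or vice versa:

$$\mathbf{i} = (\mathbf{i}\cdot\mathbf{I})\mathbf{I} + (\mathbf{i}\cdot\mathbf{J})\mathbf{J}, \qquad
\mathbf{j} = (\mathbf{j}\cdot\mathbf{I})\mathbf{I} + (\mathbf{j}\cdot\mathbf{J})\mathbf{J}.$$

Hence we have

$$x = (\mathbf{i}\cdot\mathbf{r}) = (\mathbf{i}\cdot\mathbf{I})(\mathbf{I}\cdot\mathbf{r}) + (\mathbf{i}\cdot\mathbf{J})(\mathbf{J}\cdot\mathbf{r}) = (\mathbf{i}\cdot\mathbf{I})X + (\mathbf{i}\cdot\mathbf{J})Y,$$
$$y = (\mathbf{j}\cdot\mathbf{I})X + (\mathbf{j}\cdot\mathbf{J})Y. \qquad (13)$$

These are linear equations of just the sort already found and
agree if

$$(\mathbf{i}\cdot\mathbf{I}) = \alpha', \quad (\mathbf{i}\cdot\mathbf{J}) = \alpha'', \quad (\mathbf{j}\cdot\mathbf{I}) = \beta', \quad (\mathbf{j}\cdot\mathbf{J}) = \beta''. \qquad (14)$$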

We may not assume, however, that any linear transformation of 
this sort corresponds to a rotation; the general transformation 
would be to a stretched, oblique set of axes. For the new coordi- 
nates to be obtained from the old by merely rotating, we must 
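have two conditions: (1) the vectors $\mathbf{I}$ and $\mathbf{J}$ must be at right
angles, or orthogonal, to each other; (2) $\mathbf{I}$ and $\mathbf{J}$ must be of unit
length, or, as we say, normalized. That is, in vector notation,
$(\mathbf{I}\cdot\mathbf{J}) = 0$, $I^2 = J^2 = 1$. Now we can express these equations
by taking components along the x, y axes: since $\mathbf{I} = (\mathbf{i}\cdot\mathbf{I})\mathbf{i} + (\mathbf{j}\cdot\mathbf{I})\mathbf{j}$
and $\mathbf{J} = (\mathbf{i}\cdot\mathbf{J})\mathbf{i} + (\mathbf{j}\cdot\mathbf{J})\mathbf{j}$, we have

$$\mathbf{I}\cdot\mathbf{J} = 0 = (\mathbf{i}\cdot\mathbf{I})(\mathbf{i}\cdot\mathbf{J}) + (\mathbf{j}\cdot\mathbf{I})(\mathbf{j}\cdot\mathbf{J}) = \alpha'\alpha'' + \beta'\beta'', \qquad (15)$$

or the orthogonality conditions, which we have already seen to be
satisfied, and whose significance we now see. Also

$$I^2 = 1 = (\mathbf{i}\cdot\mathbf{I})^2 + (\mathbf{j}\cdot\mathbf{I})^2 = \alpha'^2 + \beta'^2, \qquad
J^2 = 1 = \alpha''^2 + \beta''^2, \qquad (16)$$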

or the normalization conditions, which we satisfied by proper 
choice of arbitrary constants. We can, in conclusion, make the 
following statement: any linear transformation in which the 
transformation coefficients satisfy the orthogonality and normali- 
zation conditions corresponds to a rotation of coordinates. 
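A short numerical sketch of this statement follows. It assumes new axes turned through an angle `theta` from the old ones (so that $\alpha' = \cos\theta$, $\alpha'' = -\sin\theta$, $\beta' = \sin\theta$, $\beta'' = \cos\theta$), checks the conditions (15) and (16), and confirms that the distance of a point from the origin is unchanged. The determinant is also reported, since orthogonality and normalization by themselves would equally admit a reflection of the axes (determinant $-1$); the helper names are hypothetical.

```python
import math

def rotation_coefficients(theta):
    """(alpha', alpha'', beta', beta'') for axes X, Y turned by theta from x, y."""
    return math.cos(theta), -math.sin(theta), math.sin(theta), math.cos(theta)

def conditions(ap, app, bp, bpp):
    """Orthogonality sum, the two normalization sums, and the determinant."""
    return (ap * app + bp * bpp,
            ap**2 + bp**2,
            app**2 + bpp**2,
            ap * bpp - app * bp)

ap, app, bp, bpp = rotation_coefficients(math.radians(30.0))
orth, norm1, norm2, det = conditions(ap, app, bp, bpp)

# Transform an arbitrary point from (X, Y) to (x, y) and compare lengths.
X, Y = 2.0, -1.0
x = ap * X + app * Y
y = bp * X + bpp * Y
length_change = (x * x + y * y) - (X * X + Y * Y)

print(orth, norm1, norm2, det, length_change)
```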

The advantage of making our rotation is seen when we con- 
sider the mechanical problem. In the original problem we have 
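force components $F_x = -ax + cy$, $F_y = -by + cx$. We can
find the components of force in the new variables. Evidently
$F_x = (\mathbf{F}\cdot\mathbf{i})$, $F_y = (\mathbf{F}\cdot\mathbf{j})$, and similarly

$$\begin{aligned}
F_X = (\mathbf{F}\cdot\mathbf{I}) &= (\mathbf{F}\cdot\mathbf{i})(\mathbf{i}\cdot\mathbf{I}) + (\mathbf{F}\cdot\mathbf{j})(\mathbf{j}\cdot\mathbf{I}) = \alpha' F_x + \beta' F_y \\
&= \alpha'(-ax + cy) + \beta'(-by + cx) \\
&= -(\alpha' a - \beta' c)x - (-\alpha' c + \beta' b)y \\
&= -(\alpha' a - \beta' c)(\alpha' X + \alpha'' Y) - (-\alpha' c + \beta' b)(\beta' X + \beta'' Y) \\
&= -(\alpha'^2 a - 2\alpha'\beta' c + \beta'^2 b)X - (\alpha'\alpha'' a - \alpha''\beta' c - \alpha'\beta'' c + \beta'\beta'' b)Y.
\end{aligned}$$

But by results already proved, we easily see that the first
parenthesis equals $\omega'^2$ (or a corresponding expression in terms of a and
b), and the second is zero, so that $F_X = -\omega'^2 X$, and similarly $F_Y$
turns out to be $-\omega''^2 Y$. In other words, by this rotation of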
axes, we have got each component of force to depend on displace- 
ment in that direction alone. Incidentally, the method of finding 
the components of a vector in rotated coordinates which we 
have used is of general application. 

The object of the rotation becomes even clearer when we 
consider the potential energy. This is the quantity whose x
derivative is $ax - cy$, and y derivative is $by - cx$. First we
note that $\partial F_x/\partial y = \partial F_y/\partial x = c$, so that the curl of the force is
zero, and the potential exists. Then we easily see that
$V = \tfrac{1}{2}(ax^2 + by^2 - 2cxy)$, or $\tfrac{1}{2}(\omega_1^2 x^2 + \omega_2^2 y^2 - 2cxy)$. An
equipotential, obtained by setting this expression equal to a constant,
is an ellipse with its center at the origin, but with its major and
minor axes inclined at an angle to the xy axes, unless c = 0.
But now we have seen that the potential in the new coordinates
has the expression $V = \tfrac{1}{2}(\omega'^2 X^2 + \omega''^2 Y^2)$. If this is equal to a
constant, the result is the equation of an ellipse whose principal 
axes are along the X and Y axes. In other words, our whole 
change of variables has been a rotation of the coordinate axes to 
point along the principal axes of the elliptical equipotentials. 
The process of rotating axes to coincide with the principal axes 
of an ellipse or ellipsoid is a common thing in mathematical 
physics. We have already seen one example in the last chapter, 
where we had the ellipsoid of inertia, and used the principal axes 
as coordinates. Other illustrations come from the theory of 
elasticity, where there is an ellipsoid of stress at each point, and 
we often use the principal axes of stress as coordinates. Again, 
in wave mechanics, examples of the same sort of process are 
constantly found. 
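The rotation to principal axes can be sketched numerically. Writing $x = X\cos\theta - Y\sin\theta$, $y = X\sin\theta + Y\cos\theta$ in $2V = ax^2 + by^2 - 2cxy$, the coefficient of XY is $(b - a)\sin 2\theta - 2c\cos 2\theta$, which vanishes when $\tan 2\theta = 2c/(b - a)$. The function name and the numerical values below are illustrative assumptions, and this angle formula is a standard result used here in place of the $\alpha$, $\beta$ construction of the text.

```python
import math

def principal_axes(a, b, c):
    """Rotation angle and the X^2, Y^2, XY coefficients of 2V = a x^2 + b y^2 - 2 c x y."""
    theta = 0.5 * math.atan2(2.0 * c, b - a)
    cs, sn = math.cos(theta), math.sin(theta)
    coeff_xx = a * cs**2 + b * sn**2 - 2.0 * c * cs * sn
    coeff_yy = a * sn**2 + b * cs**2 + 2.0 * c * cs * sn
    coeff_xy = (b - a) * math.sin(2.0 * theta) - 2.0 * c * math.cos(2.0 * theta)
    return theta, coeff_xx, coeff_yy, coeff_xy

a, b, c = 4.0, 1.0, 1.0  # illustrative values with a*b > c**2 (closed equipotentials)
theta, coeff_xx, coeff_yy, coeff_xy = principal_axes(a, b, c)
print(math.degrees(theta), coeff_xx, coeff_yy, coeff_xy)
```

The two surviving coefficients play the part of the squared normal frequencies: their sum equals a + b and their product equals ab - c², while the XY coefficient is zero to rounding error.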

73. The General Problem of the Motion of Several Particles. — 
The present problem is the first one we have met in which there 
are several particles interacting with each other, and it has illus- 
trated one of the useful methods of attack on such a problem. 
This is to take all the coordinates, whether they refer to one or 
another particle, and imagine them all plotted in a many- 
dimensional space, like the phase space which we discussed in 
connection with the Hamiltonian method, but with only enough 
dimensions to take care of coordinates, not of momenta. Such 
a space is often called a configuration space. Then the motion 
of the system is given by the motion of a point in configuration 
space. If there is a potential, it is a function of position in 
configuration space. We can then apply many of the same ideas 
to the motion of the point in many-dimensional space that we 
would to the motion of a single particle in three-dimensional 
space. Thus there will be parts of configuration space where 
E — V is positive; there the point can go, but it cannot enter the 
regions where E — V is negative. In some cases, changes of 
variables in configuration space can simplify the problem 
enough so that we can separate variables, or at least go far 
toward a solution. The present chapter has supplied one 
instance. Another is found in the problem of two particles, as 


the earth and sun, exerting forces on each other but not being 
acted on by outside bodies. There we can introduce new coordi- 
nates: first, the three coordinates of the center of gravity of the 
system; second, the coordinates of one particle relative to the 
other. And in terms of these new coordinates, the three coordi- 
nates of the center of gravity become separated from the others, 
resulting in a uniform motion of the center of gravity in a straight 
line, and the relative motion reduces to a problem mathematically 
equivalent to the motion of a single particle in three-dimensional 
space. The changes of variables used in these cases generally 
have the property, which we have noted in the present case, of 
mixing up the coordinates of two or more particles in a single 
generalized coordinate. 
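For the two-particle example, the separation can be illustrated on the kinetic energy. In center-of-gravity and relative coordinates it becomes a sum of two independent terms, one with the total mass and one with the quantity $m_1 m_2/(m_1 + m_2)$ (commonly called the reduced mass, a name not used in the passage above). The sketch below checks this identity for arbitrary illustrative masses and velocities; the function name is hypothetical.

```python
def kinetic_energies(m1, v1, m2, v2):
    """Kinetic energy computed directly and in separated coordinates."""
    total_mass = m1 + m2
    reduced_mass = m1 * m2 / total_mass
    v_cm = [(m1 * a + m2 * b) / total_mass for a, b in zip(v1, v2)]
    v_rel = [a - b for a, b in zip(v1, v2)]

    def square(v):
        return sum(comp * comp for comp in v)

    direct = 0.5 * m1 * square(v1) + 0.5 * m2 * square(v2)
    separated = 0.5 * total_mass * square(v_cm) + 0.5 * reduced_mass * square(v_rel)
    return direct, separated

direct, separated = kinetic_energies(3.0, (1.0, 2.0, 0.0), 1.0, (-1.0, 0.0, 2.0))
print(direct, separated)
```

No cross term between the center-of-gravity velocity and the relative velocity appears, which is why the two sets of coordinates separate when the forces depend only on the relative position.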


1. Two balls, each of mass m, and three weightless springs, one of length 
2d, the others of length d, are connected together in the arrangement spring 
d — ball — spring 2d — ball — spring d, and the whole thing is stretched in a 
straight line between two points, with a given tension in the springs. Grav- 
ity is neglected. Investigate the small vibrations of the balls at right angles 
to the straight line, assuming motion only in one plane. Show in general 
that there are two modes of vibration, one having the lower frequency, in 
which both balls oscillate to the same side at one time, then the other, and 
the second mode, with higher frequency, where they oscillate to opposite 
sides. (Hint: if the first is displaced $x_1$, and the second $x_2$, and if these
displacements are so small that the tension t is unchanged, then there will
be two forces acting on the first ball: a force t toward the point of support,
making an angle whose tangent is $x_1/d$, and another directed toward the
second ball, at an angle whose tangent is $(x_2 - x_1)/2d$. The component at
right angles to the straight line, and thus producing the motion, is then
$-x_1(t/d) + (x_2 - x_1)(t/2d)$. Similarly the force on the second is
$-x_2(t/d) + (x_1 - x_2)(t/2d)$.)

2. Assume two resistanceless circuits, one with $L_1$, $C_1$, the other with
$L_2$, $C_2$, coupled together by having a mutual inductance M between the two
inductances (that is, back e.m.f. of self- and mutual inductance is
$-L_1\,di_1/dt - M\,di_2/dt$ in the first circuit, and $-L_2\,di_2/dt - M\,di_1/dt$ in the second
circuit, where $i_1$, $i_2$ are the currents in the circuits). Find the frequencies
of the natural oscillations of the coupled system.

3. In Prob. 2, assume that the circuits have small resistances $R_1$ and $R_2$,
respectively, so small that the logarithmic decrements of the separate
circuits are small. Discuss the damped oscillations, showing that the solution
can be carried out if squares of resistances are small enough to be neglected,
but that it leads to a biquadratic equation for the frequency for large R.
(Hint: write the frequency as the sum of a real and an imaginary part.)

4. Two identical pendulums hang from a support which is slightly yield- 
ing, so that they can interchange energy. Assume that coupling is linear. 
Now suppose one pendulum is set into motion, the other being at rest. 


Show that gradually the first pendulum will come to rest, the second taking 
up the motion, and that there is a periodic pulsation of the energy from one 
pendulum to the other. Show that the frequency of this pulsation gets 
smaller as the coupling becomes smaller, until with an infinitely rigid 
support the energy remains always in the first pendulum (this is all without 
damping forces). 

5. One simple pendulum is hung from another; that is, the string of the 
lower pendulum is tied to the bob of the upper one. Discuss the small 
oscillations of the resulting system, assuming arbitrary lengths and masses. 
Use the angles which each string makes with the vertical as generalized 
coordinates. In the special case of equal masses and equal lengths of 

strings, show that the frequencies of the motion are given by $\sqrt{g(2 \pm \sqrt{2})/l}$.

6. Show that if the mass of the upper pendulum becomes very great 
compared with the lower one, the solution of Prob. 5 approaches that of 
Prob. 8, Chap. IV. Show in the other limiting case, where the upper mass 
is small compared with the lower one, that the motion consists approxi- 
mately of an oscillation of the large mass with a period derived from the 
combined length of both pendulums, and a more rapid oscillation of the 
small mass back and forth with respect to the line connecting point of 
support and large mass. 

7. Given an ellipse ax 2 + bxy + cy 2 = d, perform a rotation of axes so 
that the new coordinates will lie along the major and minor axes of the 
ellipse. From this rotation, find the angle between the major axis and the 
x axis, in terms of the coefficients a, b, c, d. It is simplest to write
the transformation directly in terms of the angle $\theta$: $x' = x\cos\theta + y\sin\theta$, etc.

8. Show that if the equations

$$x' = a_{11}x + a_{12}y + a_{13}z,$$
$$y' = a_{21}x + a_{22}y + a_{23}z,$$
$$z' = a_{31}x + a_{32}y + a_{33}z$$

represent a rotation of coordinates, the a's satisfy orthogonality and
normalization relations, both of the form $a_{11}a_{12} + a_{21}a_{22} + a_{31}a_{32} = 0$,
$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1$, and of the form $a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} = 0$,
$a_{11}^2 + a_{12}^2 + a_{13}^2 = 1$.

9. In the rotation of coordinates above, show that the inverse
transformation is given by

$$x = a_{11}x' + a_{21}y' + a_{31}z',$$
$$y = a_{12}x' + a_{22}y' + a_{32}z',$$
$$z = a_{13}x' + a_{23}y' + a_{33}z'.$$

Prove that the determinant of the a's is equal to unity. 

10. Find the components of an arbitrary vector in the rotated set of 
coordinates given in Prob. 8. Show that the components of grad V, where 
V is a scalar, in the rotated axes, are $\partial V/\partial x'$, $\partial V/\partial y'$, $\partial V/\partial z'$; that is, that
the gradient is invariant under a rotation of axes (has the same form in the 
new axes as in the old). 

11. Prove that the divergence, curl, and Laplacian are invariant under a 

12. Set up a method for getting the direction cosines of the principal 
axes of inertia of a body, and the values of the principal moments of inertia, 
if the moments and products of inertia are known in a particular coordinate 



In this chapter we turn to the discussion of the motion of a 
continuous medium. There are examples of such motion in one, 
two, or three dimensions; as a vibrating string in one dimension, 
a membrane in two, and an elastic solid, or gas, in three dimen- 
sions. We first consider the motion of a one-dimensional body, 
or string. Suppose we have a string of length L, mass $\mu$ per unit
length (constant), with a tension T, set into transverse vibrations.
From our elementary work, we know that an infinite number of
modes of vibrations, or overtones, are possible. For the nth
overtone, if it is present alone, the shape of the string at any time
is given by $\sin(n\pi x/L)$, where x is the coordinate of a point on the
string measured from one end, and the function is proportional
to the displacement transverse to the string. The frequency of
this overtone is $\omega_n/2\pi$, where $\omega_n = (n\pi/L)\sqrt{T/\mu}$. Thus if $A_n$
is the complex amplitude of this overtone, and u is the
displacement of the point x, we have

$$u = \text{real part of } \sum_{n=1}^{\infty} A_n \sin\frac{n\pi x}{L}\, e^{i\omega_n t},$$

where we sum over all the possible overtones. Our first task is 
to derive these results from fundamental principles. 

74. Differential Equation of the Vibrating String. — Assume 
that at a given time the string is displaced so that its shape is 
given by u{x). We consider how this curve will change with 
time, and consider transverse displacements so small that the 
tension T may be considered constant throughout the string. 
Take a short element of the string of length dx and mass $\mu\,dx$.
Its acceleration is $\partial^2 u/\partial t^2$ (x kept constant), so that its mass
times its acceleration is $\mu\,dx\,\partial^2 u/\partial t^2$. This must be equal to the
force acting on this element which arises from the tensions.
These tensions (which we take equal to each other in magnitude)
would cancel each other exactly if the string were straight, but
when it is curved, they each give rise to components approximately
perpendicular to the string which vary with the curvature
of the string (see Fig. 19). At any point x, this component is
approximately $T\,\partial u/\partial x$, and we work only to the approximation
to which this is true.

Fig. 19. Tensions on an element of string. Vertical component at x is
$-T\sin\theta$. If we approximate $\sin\theta$ by $\tan\theta$, this is $-T\,\partial u/\partial x$. Similarly at
x + dx, the component is $+T\,\partial u/\partial x$, but now computed at x + dx.

Thus the total force on the element of string is

$$T\left(\frac{\partial u}{\partial x}\right)_{x+dx} - T\left(\frac{\partial u}{\partial x}\right)_{x} = T\frac{\partial^2 u}{\partial x^2}\,dx,$$

if we expand the first term in a Taylor's series and retain only
the first two terms of the expansion. Thus our equation of
motion is

$$\mu\,dx\,\frac{\partial^2 u}{\partial t^2} = T\frac{\partial^2 u}{\partial x^2}\,dx, \qquad \text{or} \qquad \mu\frac{\partial^2 u}{\partial t^2} = T\frac{\partial^2 u}{\partial x^2}.$$

This is a partial differential equation, since it contains partial 
derivatives. This appearance of partial derivatives is charac- 
teristic of all equations of motion of continuous media. Since 
the equation is linear, with constant coefficients, let us try to 
solve it by the exponential method, assuming $u = e^{i(\omega t + kx)}$, as
would be suggested by the solution in terms of overtones. The
equation of motion leads immediately to $-\mu\omega^2/T = -k^2$,
determining $\omega$ in terms of k. Combining two exponential solutions,
allowable since the equation is linear and homogeneous,
we have

$$u = Ae^{i\omega t}\sin kx, \qquad u = Be^{i\omega t}\cos kx. \qquad (2)$$

Now we must introduce the boundary conditions, which tell us 
that the string is held fixed at both ends, so that u = 0 when
x = 0, and when x = L. From the first of these conditions
B = 0, and we take only the sine function. From the second,
we must have $\sin kL = 0$, or $kL = n\pi$,

$$k = \frac{n\pi}{L},$$

where n = 1, 2, 3, ...
Hence the solution is

$$u = A_n e^{i\omega_n t}\sin\frac{n\pi x}{L}, \qquad \omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}}. \qquad (3)$$

Superposing the solutions of all the different n's, as we may from 
the nature of the differential equation, we obtain the solution 
mentioned at the beginning of the chapter. 
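The overtone solution can be tested numerically. The sketch below, with hypothetical function names and illustrative values of L, T and $\mu$, evaluates $\omega_n = (n\pi/L)\sqrt{T/\mu}$ and uses centered finite differences to check that the real form $u = \cos\omega_n t\,\sin(n\pi x/L)$ satisfies $\mu\,\partial^2 u/\partial t^2 = T\,\partial^2 u/\partial x^2$ up to discretization error, and that it vanishes at both ends.

```python
import math

def omega(n, length, tension, mu):
    """Angular frequency of the nth overtone of the string."""
    return n * math.pi / length * math.sqrt(tension / mu)

def u(x, t, n, length, tension, mu):
    """Real form of the nth overtone with unit amplitude."""
    return math.cos(omega(n, length, tension, mu) * t) * math.sin(n * math.pi * x / length)

def residual(x, t, n, length, tension, mu, h=1e-4):
    """mu*u_tt - T*u_xx estimated by centered second differences of step h."""
    mid = u(x, t, n, length, tension, mu)
    u_tt = (u(x, t + h, n, length, tension, mu) - 2 * mid
            + u(x, t - h, n, length, tension, mu)) / h**2
    u_xx = (u(x + h, t, n, length, tension, mu) - 2 * mid
            + u(x - h, t, n, length, tension, mu)) / h**2
    return mu * u_tt - tension * u_xx

L, T, MU = 2.0, 9.0, 1.0  # illustrative values only
print(omega(1, L, T, MU))                 # fundamental angular frequency
print(residual(0.7, 0.3, 3, L, T, MU))    # small compared with mu*omega^2*u
print(u(0.0, 0.3, 3, L, T, MU), u(L, 0.3, 3, L, T, MU))  # fixed ends
```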

Our differential equation is a linear homogeneous partial 
differential equation of second order. As such, any linear com- 
bination of solutions is itself a solution. But now we have, not a 
small number of arbitrary constants, but the doubly infinite set 
$A_n$, n = 1, 2, ... ($A_n$ is complex). This is characteristic of
all partial differential equations. Sometimes instead of having
an infinite set of arbitrary constants, we have an arbitrary function.
In our case the A's are determined by giving the amplitude
and phase of each overtone. These must be determined from
the initial conditions; that is, from the values of $u(x)$ and $\dot{u}(x)$
at t = 0. The essential point is that our partial differential
equation is equivalent to an infinite number of ordinary differ- 
ential equations, so that we need an infinite number of constants. 

75. The Initial Conditions for the String. — Suppose we wish 
to satisfy initial conditions of the following sort for a vibrating 
string: at t = 0, the displacement and velocity are given functions
of x. That is, if the displacement is u(x, t), then $u(x, 0) = f(x)$,
$\frac{\partial u}{\partial t}(x, 0) = F(x)$, where f(x), F(x) are arbitrary functions.

Now we may write

$$u(x, t) = \sum_n (C_n\cos\omega_n t + D_n\sin\omega_n t)\sin\frac{n\pi x}{L}, \qquad (4)$$

using the real form of the function of time, and

$$\dot{u}(x, t) = \sum_n (-C_n\omega_n\sin\omega_n t + D_n\omega_n\cos\omega_n t)\sin\frac{n\pi x}{L}.$$

Thus we must have

$$u(x, 0) = f(x) = \sum_n C_n\sin\frac{n\pi x}{L}, \qquad
\dot{u}(x, 0) = F(x) = \sum_n D_n\omega_n\sin\frac{n\pi x}{L}. \qquad (5)$$


To satisfy either of these conditions, we must be able to expand 
our arbitrary function in series of sines, and to find the coeffi- 
cients C n or D n of these expansions. Having found the coeffi- 
cients, we can at once set up the series for u(x, t). This is a special 
case of Fourier expansion, and we now proceed to consider the 
general problem of Fourier series, a question of general interest 
apart from the application to a string. 

76. Fourier Series. — We shall state Fourier's theorem. 
Given an arbitrary function $\phi(x)$. Then [unless $\phi(x)$ contains
an infinite number of discontinuities in a finite range, or similarly
misbehaves itself], we can write

$$\phi(x) = \frac{A_0}{2} + \sum_{n=1}^{\infty}\left(A_n\cos\frac{2n\pi x}{X} + B_n\sin\frac{2n\pi x}{X}\right),$$

where

$$A_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx, \qquad
B_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\sin\frac{2n\pi x}{X}\,dx. \qquad (6)$$

This equation holds for values of x between $-X/2$ and $X/2$, but
not in general outside this range. The series of sines and cosines 
is called Fourier's series. Obviously a special case of it could be 
used in our problem of the string, the case where only the coeffi- 
cients of the sine terms were different from zero. 


There are two sides to the proof of Fourier's theorem. First, 
we may prove that, if a series of sines and cosines of this sort can 
represent the function, then it must have the coefficients we have 
given. That is simple, and we shall carry it through. But, 
second, we could show that the series we so set up actually 
represents the function. That is, we should investigate the 
convergence of the series, show that it does converge and that its
sum is the function $\phi(x)$. This second part we shall omit, merely
stating the results of the discussion. 

77. Coefficients of Fourier Series. — Let us suppose that
$\phi(x)$ is given by the series above, and ask what values of A's and
B's we must have if the equation is to be true. Multiply both
sides of the equation by $\cos(2m\pi x/X)$, where m is an integer, and
integrate from $-X/2$ to $X/2$. We have then

$$\int_{-X/2}^{X/2}\phi(x)\cos\frac{2m\pi x}{X}\,dx
= \int_{-X/2}^{X/2}\frac{A_0}{2}\cos\frac{2m\pi x}{X}\,dx
+ \sum_n\int_{-X/2}^{X/2}\left(A_n\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X} + B_n\sin\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\right)dx.$$

But now we shall show in the next paragraph that

$$\int_{-X/2}^{X/2}\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0, \qquad (7)$$

if n and m are integers, unless n = m, and that

$$\int_{-X/2}^{X/2}\sin\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0,$$

if n and m are integers. Thus all terms on the right are zero but
one, for which n = m. The first term falls in with the rule,
when we remember that cos 0 = 1. This one term then gives us

$$A_m\int_{-X/2}^{X/2}\cos^2\frac{2m\pi x}{X}\,dx = A_m\frac{X}{2},$$

as we can readily show. Hence

$$A_m = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\cos\frac{2m\pi x}{X}\,dx.$$

In a similar way, multiplying by $\sin(2\pi mx/X)$, we can prove the
formula for $B_n$.

In our derivation of coefficients, we have used the following
results: $\int_{-X/2}^{X/2}\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0$, if n, m are different
integers, and similar relations with sines. We can prove these very
easily from trigonometry. Thus

$$\cos a\cos b = \frac{\cos(a + b) + \cos(a - b)}{2},$$

so that our quantity is the integral of this, or

$$\frac{X}{2}\left[\frac{\sin\dfrac{2\pi(n + m)x}{X}}{2\pi(n + m)} + \frac{\sin\dfrac{2\pi(n - m)x}{X}}{2\pi(n - m)}\right]_{-X/2}^{X/2}.$$

But the quantity in brackets is zero at both limits, if n, m are
integers, and the result is zero. Such proofs hold in the other
cases. The exception, of course, is the case n = m, in which the
integrand is $\tfrac{1}{2}(\cos(4\pi mx/X) + 1)$, so that, while the first term gives
no contribution to the result, the second gives $\tfrac{1}{2}\int_{-X/2}^{X/2}dx = X/2$.
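These integrals are easy to confirm numerically. The following sketch uses a simple midpoint rule (an assumption of the example, as are the function names and the choice X = 3); for trigonometric integrands over a whole period this rule is accurate to rounding error.

```python
import math

def integrate(f, a, b, steps=2000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def cos_cos(n, m, X):
    """Integral of cos(2 n pi x / X) cos(2 m pi x / X) from -X/2 to X/2."""
    return integrate(lambda x: math.cos(2 * n * math.pi * x / X)
                     * math.cos(2 * m * math.pi * x / X), -X / 2, X / 2)

def sin_cos(n, m, X):
    """Integral of sin(2 n pi x / X) cos(2 m pi x / X) from -X/2 to X/2."""
    return integrate(lambda x: math.sin(2 * n * math.pi * x / X)
                     * math.cos(2 * m * math.pi * x / X), -X / 2, X / 2)

X = 3.0
print(cos_cos(2, 5, X))   # different integers: close to 0
print(cos_cos(4, 4, X))   # n = m: close to X/2
print(sin_cos(3, 3, X))   # sine times cosine: close to 0, even for n = m
```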

78. Convergence of Fourier Series. — In this section we shall 
merely quote results. In the first place, the series cannot in 
general represent the function, except in the region between
$-X/2$ and $X/2$. For the series is periodic, repeating itself in
every period X, while the function in general is not. Only
periodic functions of this period can be represented in all their
range by Fourier series. If we try to represent a nonperiodic
function, the representation will be correct within the range from
$-X/2$ to $X/2$, but the same thing will automatically repeat
outside the range. Incidentally, we can easily change the range
in which the function is correct. If we merely change the range
of integration so as to be from $x_0$ to $x_0 + X$, where $x_0$ is arbitrary,
the series will represent the function within this range. The
case we have used above corresponds to $x_0 = -X/2$; another
choice frequently made is $x_0 = 0$. Then again, if we change the
value of X, we can change the length of the range in which the 
series is correct. To represent a function through a large range 
of x, we may use a large value of X. 

Although the range within which a Fourier series converges 
to the value of the function it is supposed to represent is limited, 
as we have seen, there is a compensation, in that within this range 


a Fourier series can be used to represent much worse curves than 
a power series. Thus the convergence of the series is not impaired 
if the function has a finite number of discontinuities. It can 
consist, for example, of one function in one part of the region, 
another in another (in this case, to carry out the integrations, 
we must break up the integral into separate integrals over these 
parts, and add them). The less serious the discontinuities, 
however, the better the convergence. Thus if the function itself 
has discontinuities, the coefficients will go off as 1/n, while if 
only the first derivative has discontinuities the coefficients go off 
as $1/n^2$, and so on. Differentiating a function makes the
convergence of a series worse, as we can see, for example, if a function
is continuous but its first derivative discontinuous. Then the
coefficients go off as $1/n^2$, but if we differentiate, the coefficients
of the resulting series will go off as 1/n. There is an interesting
point connected with the series for a discontinuous function.
If the function jumps from one value $u_1$ to another $u_2$ at a given
value of x, then the series at this point converges to the mean
value, $(u_1 + u_2)/2$.
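A simple case illustrates both remarks. Take, as an assumed example, the function equal to $u_1$ for $-X/2 < x < 0$ and to $u_2$ for $0 < x < X/2$. Carrying out the integrals for the coefficients gives $A_0/2 = (u_1 + u_2)/2$, all other $A_n = 0$, and $B_n = (u_2 - u_1)(1 - \cos n\pi)/(n\pi)$, which go off as 1/n. The sketch below, with hypothetical function names, sums the series at the jump and at a point inside one of the constant stretches.

```python
import math

def step_coefficients(u1, u2, n_max):
    """Sine coefficients B_1..B_n_max for the step from u1 (x < 0) to u2 (x > 0)."""
    return [(u2 - u1) * (1.0 - math.cos(n * math.pi)) / (n * math.pi)
            for n in range(1, n_max + 1)]

def partial_sum(u1, u2, x, X, n_max):
    """Fourier series of the step function, truncated after n_max sine terms."""
    total = 0.5 * (u1 + u2)  # the constant term A_0 / 2
    for n, b_n in enumerate(step_coefficients(u1, u2, n_max), start=1):
        total += b_n * math.sin(2 * n * math.pi * x / X)
    return total

u1, u2, X = 1.0, 3.0, 2.0  # illustrative values only
print(partial_sum(u1, u2, 0.0, X, 50))     # at the jump: the mean value
print(partial_sum(u1, u2, 0.5, X, 2001))   # inside the region x > 0: close to u2
coeffs = step_coefficients(u1, u2, 9)
print([n * b for n, b in enumerate(coeffs, start=1)])  # n * B_n: constant for odd n
```

At x = 0 every sine term vanishes, so the sum equals the mean value for any number of terms; away from the jump the slow 1/n decay shows itself in the large number of terms needed for modest accuracy.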

79. Sine and Cosine Series, with Application to the String. — 
In the special problem of the vibrating string, the series we 
require is somewhat different from the general case, in that there 
are only sines, and not cosines. We are therefore led to investigate
series of sines only, or of cosines only. Suppose we take
the series $\frac{A_0}{2} + \sum_{n=1}^{\infty} A_n\cos\frac{2n\pi x}{X}$, the series formed by taking
the cosine part of the general Fourier series. Now each one
of the terms is even in x; that is, if we interchange x with $-x$,
the function is not changed. A cosine series represents therefore
an even function. Similarly the sine series $\sum B_n\sin\frac{2n\pi x}{X}$,
of which each term is odd, represents an odd function (one for
which, if x is interchanged with $-x$, the function changes its
sign but not its magnitude). It is well known that any function
$\phi(x)$ can be written as the sum of an even and an odd function:
$\phi(x) = \tfrac{1}{2}[\phi(x) + \phi(-x)] + \tfrac{1}{2}[\phi(x) - \phi(-x)]$, of which the first
term is even, the second odd. Thus the cosine part of a Fourier
series represents the even part of the function, the sine series 
the odd part. As a corollary, any even function can be repre- 


sented by a cosine series alone, an odd function by a sine series. 
Now suppose we are really interested in a function only
between 0 and $X/2$, and that we do not care what the series does
outside that region. Then we may define an even function
$\phi_e(x)$ as follows: it equals the given function $\phi(x)$ between 0 and
$X/2$, but has just the same value for $-x$ that it has for x (see
Fig. 20). Outside the range from $-X/2$ to $X/2$, it repeats itself.
The Fourier representation of $\phi_e$ will be a cosine series, but will
represent our given function $\phi$ correctly between 0 and $X/2$.

Fig. 20. A function, with even and odd periodic functions made from it.
The even and odd functions, $\phi_e(x)$ and $\phi_o(x)$, agree with the original function
$\phi(x)$ between 0 and $X/2$. Between 0 and $-X/2$, $\phi_e(x)$ is the mirror image of
$\phi(x)$, while $\phi_o(x)$ has the opposite sign. Outside the region from $-X/2$ to $X/2$,
both functions repeat periodically with period X.


Evidently it is the series $\frac{A_0}{2} + \sum_{n=1}^{\infty} A_n\cos\frac{2n\pi x}{X}$, where we write
the coefficients as the sum of two integrals,

$$\begin{aligned}
A_n &= \frac{2}{X}\int_{-X/2}^{X/2}\phi_e(x)\cos\frac{2n\pi x}{X}\,dx \\
&= \frac{2}{X}\left[\int_{-X/2}^{0}\phi(-x)\cos\frac{2n\pi x}{X}\,dx + \int_{0}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx\right] \\
&= \frac{4}{X}\int_{0}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx.
\end{aligned}$$

Similarly we may define an odd function $\phi_o(x)$, which equals $\phi(x)$
between 0 and $X/2$, but at $-x$ has the negative of its value at $+x$.
This function is represented by a sine series $\sum_{n=1}^{\infty} B_n\sin\frac{2n\pi x}{X}$,
where we readily see that

$$B_n = \frac{4}{X}\int_{0}^{X/2}\phi(x)\sin\frac{2n\pi x}{X}\,dx.$$

Hence, between 0 and $X/2$, the same function can be represented
by either a cosine or a sine series. But outside this range, the 
series represent quite different functions. 
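The following sketch compares the two representations for one assumed function, $\phi(x) = x(X/2 - x)$ on 0 to X/2 with X = 2. The coefficients are found by a midpoint-rule integration over the half range, and the series are truncated at 60 terms; both choices, like the function names, are conveniences of the example rather than anything prescribed in the text.

```python
import math

X = 2.0
N_MAX = 60

def phi(x):
    """Illustrative function, used only between 0 and X/2."""
    return x * (X / 2 - x)

def half_range_integral(f, steps=2000):
    """Midpoint-rule estimate of the integral of f from 0 to X/2."""
    h = (X / 2) / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))

# A_0, A_1, ..., A_N_MAX and B_1, ..., B_N_MAX from the half-range formulas.
A = [4 / X * half_range_integral(lambda s, n=n: phi(s) * math.cos(2 * n * math.pi * s / X))
     for n in range(N_MAX + 1)]
B = [4 / X * half_range_integral(lambda s, n=n: phi(s) * math.sin(2 * n * math.pi * s / X))
     for n in range(1, N_MAX + 1)]

def cosine_series(x):
    return A[0] / 2 + sum(A[n] * math.cos(2 * n * math.pi * x / X)
                          for n in range(1, N_MAX + 1))

def sine_series(x):
    return sum(B[n - 1] * math.sin(2 * n * math.pi * x / X)
               for n in range(1, N_MAX + 1))

print(phi(0.3), cosine_series(0.3), sine_series(0.3))   # all nearly equal
print(cosine_series(-0.3), sine_series(-0.3))           # equal and opposite
```

Inside the half range both series reproduce the function to within the truncation error; at the mirror point the cosine series repeats the value and the sine series reverses its sign.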

Our sine series can now be applied to the string problem. 
We are interested in the string between 0 and L. Let us then
set $L = X/2$. The expression then becomes

$$\phi(x) = \sum_{n=1}^{\infty} B_n\sin\frac{n\pi x}{L}, \qquad (8)$$

where

$$B_n = \frac{2}{L}\int_{0}^{L}\phi(x)\sin\frac{n\pi x}{L}\,dx.$$

This can be used first to find the coefficients $C_n$, from Eq. (5),
substituting $u(x, 0)$ for $\phi(x)$, and next to find the quantities
$D_n\omega_n$, substituting $\dot{u}(x, 0)$ for $\phi(x)$, $D_n\omega_n$ for $B_n$, and obtaining
$D_n$ by dividing through by $\omega_n$. These formulas then suffice to
find the constants $C_n$ and $D_n$ of the motion of the string, knowing
the initial displacement and velocity of every point of it. 
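As a worked illustration, the sketch below finds $C_n$ and $D_n$ for a string pulled aside at one quarter of its length and released from rest, then sums Eq. (4). The triangular initial shape, the numerical values, the midpoint-rule integration, the truncation at 100 terms, and the helper names are all assumptions of the example. Two checks are made: the series reproduces the initial shape at t = 0, and after half the fundamental period the shape is reversed in sign and reflected about the midpoint of the string, a property that follows from $\cos n\pi = (-1)^n$ in each term.

```python
import math

def sine_coefficient(func, n, length, steps=2000):
    """(2/L) * integral from 0 to L of func(x) sin(n pi x / L) dx, midpoint rule."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += func(x) * math.sin(n * math.pi * x / length)
    return 2.0 / length * total * h

def string_motion(f, F, length, tension, mu, n_max=100):
    """Return u(x, t) built from initial displacement f and initial velocity F."""
    modes = []
    for n in range(1, n_max + 1):
        w = n * math.pi / length * math.sqrt(tension / mu)
        c_n = sine_coefficient(f, n, length)
        d_n = sine_coefficient(F, n, length) / w
        modes.append((n, w, c_n, d_n))

    def u(x, t):
        return sum((c_n * math.cos(w * t) + d_n * math.sin(w * t))
                   * math.sin(n * math.pi * x / length)
                   for n, w, c_n, d_n in modes)
    return u

L, T, MU = 1.0, 1.0, 1.0  # illustrative values only

def pluck(x):
    """Triangular shape of height 0.01 with its peak at x = 0.25."""
    return 0.04 * x if x < 0.25 else 0.04 * (1.0 - x) / 3.0

def at_rest(x):
    return 0.0

u = string_motion(pluck, at_rest, L, T, MU)
half_period = math.pi / (math.pi / L * math.sqrt(T / MU))
print(u(0.4, 0.0), pluck(0.4))
print(u(0.4, half_period), -pluck(L - 0.4))
```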

80. The String as a Limiting Problem of Vibration of Particles. 
An excellent insight into the problem of the vibration of a string 
is obtained by regarding it as a limiting case of mechanical 
systems with a finite number of particles, having therefore a finite
set of arbitrary constants in the solution. This is the method
followed by Lagrange. Suppose we have N equal masses m
at the points $x = d/2, 3d/2, \ldots, (N - \tfrac{1}{2})d$, separated by
massless springs, the whole being stretched with a tension T
between supports at x = 0 and x = Nd = L. This forms an
approximation to the continuous string, if $\mu = m/d$, the mass
per unit length. We again investigate the transverse vibrations,
letting the displacement of the ith particle be $u_i$. The problem


is similar to Problem 1, Chap. XI. The force on the ith particle 

T T 

Fi = —{ui — Ui-i)j — {ui — Ui+i)-j> 

except for the first particle, where we have 

p 2T i \ T 

and for the iVth, 

T 2T 

F N = —(u N — u N -i)j — u n~j- 

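Then, assuming a solution $u_j = C_j e^{i\omega t}$, we have the $N$ equations of motion in the form

$$\begin{aligned}
\left(-m\omega^2 + \frac{3T}{d}\right)C_1 - \frac{T}{d}C_2 &= 0\\
-\frac{T}{d}C_1 + \left(-m\omega^2 + \frac{2T}{d}\right)C_2 - \frac{T}{d}C_3 &= 0\\
-\frac{T}{d}C_2 + \left(-m\omega^2 + \frac{2T}{d}\right)C_3 - \frac{T}{d}C_4 &= 0\\
\cdots\\
-\frac{T}{d}C_{N-1} + \left(-m\omega^2 + \frac{3T}{d}\right)C_N &= 0.
\end{aligned} \tag{9}$$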

Such a set of equations, all alike in form (except here for the first and last), are called difference equations. As in the last chapter, these have a solution only for certain values of $\omega$, given by setting the determinant of coefficients equal to zero. The determinant is now too complicated to handle simply, wherefore we adopt another method of procedure. Suppose we let $C_j = e^{ikj}$, where $k$ is to be determined from the equations above. All the equations except the first and last take the form

$$-\frac{T}{d}e^{ik(j-1)} + \left(-m\omega^2 + \frac{2T}{d}\right)e^{ikj} - \frac{T}{d}e^{ik(j+1)} = 0,$$

$$-\frac{2T}{d}\cos k + \left(-m\omega^2 + \frac{2T}{d}\right) = 0,$$

$$m\omega^2 = \frac{2T}{d}(1 - \cos k). \tag{10}$$



That is, for any $\omega$, we can choose a value of $k$ by this relation, so that all the equations except the first and last are satisfied. These fall into line as well if we set up $C_0 = -C_1$ and $C_{N+1} = -C_N$, so that if these conditions are satisfied we have $e^{ikj}$, or equally well $e^{-ikj}$, or $\sin kj$ or $\cos kj$, as solutions of the equations for $C_j$. These conditions on $C_0$ and $C_{N+1}$ are essentially boundary conditions, one at each end of the string, and we readily see that they are satisfied if we make our function zero for $x = 0$, $x = L$, as we do if it is $\sin\frac{n\pi x}{L}$, where $n$ is an integer. That is, since $x$ is $(j - \tfrac12)d$ for the $j$th particle, we have

$$C_{jn} = \sin\frac{n\pi}{N}\left(j - \frac12\right), \tag{11}$$

so that $k = n\pi/N$. We see from this form of $C_{jn}$ that $C_{0n} = -C_{1n}$, and that only those values of $n$ up to $N$ give us different sets of $C$'s. If $n$ is greater than $N$, then for each integral $j$ we get just the same value of $C_{jn}$ that we had for a certain $n$ less than $N$, so that the whole scheme repeats itself over and over as $n$ increases, and we really have only $N$ distinct solutions. Similarly in the expression for the frequency, the term $1 - \cos k = 1 - \cos(n\pi/N)$ is periodic, so that as soon as $n$ becomes greater than $N$
we repeat the frequencies already found. There are, then, 
just N solutions, each with its frequency and its complex ampli- 
tude for each particle. This fits in with the single frequency 
for one particle and the two which we have found for two coupled 
particles. For each of the N particles there is an arbitrary 
amplitude and phase, or arbitrary complex amplitude, so 
that there are just 2N arbitrary constants. The whole solu- 
tion is the sum, as n goes from 1 to N, of the real parts of 

$A_n e^{i\omega_n t}\sin\frac{n\pi x}{L}$, or

$$u = \sum_{n=1}^{N} B_n \sin\frac{n\pi x}{L}\cos(\omega_n t - \epsilon_n). \tag{12}$$

Each one of these terms represents the amplitudes of all the particles when vibrating with a particular mode of motion, analogous to an overtone of the string. To get the amplitude of the $j$th particle, we set $x = (j - \tfrac12)d$. The angular velocity $\omega_n$ of the $n$th overtone is given by

$$m\omega_n^2 = \frac{2T}{d}\left(1 - \cos\frac{n\pi}{N}\right). \tag{13}$$
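The claim that Eqs. (11) and (13) together satisfy all $N$ difference equations (9), including the first and last, can be checked numerically. The sketch below is only a check under assumed unit values of $T$, $d$, and $m$; the function names are illustrative and not from the text.

```python
import math

def omega(N, n, T=1.0, d=1.0, m=1.0):
    """Angular frequency of the nth mode from Eq. (13)."""
    return math.sqrt(2.0 * T / (m * d) * (1.0 - math.cos(n * math.pi / N)))

def mode_residual(N, n, T=1.0, d=1.0, m=1.0):
    """Largest absolute residual of Eqs. (9) for mode n, with C_j from Eq. (11).

    The list C carries the fictitious end values C_0 and C_{N+1}; with
    C_j = sin(k (j - 1/2)) these automatically equal -C_1 and -C_N, so the
    general difference equation reproduces the first and last of Eqs. (9).
    """
    k = n * math.pi / N
    omega_sq = omega(N, n, T, d, m) ** 2
    C = [math.sin(k * (j - 0.5)) for j in range(0, N + 2)]
    worst = 0.0
    for j in range(1, N + 1):
        r = (-T / d * C[j - 1]
             + (-m * omega_sq + 2.0 * T / d) * C[j]
             - T / d * C[j + 1])
        worst = max(worst, abs(r))
    return worst
```

Calling `mode_residual(4, n)` for `n` from 1 to 4 should give residuals at rounding-error level, and `omega(4, 5)` reproduces `omega(4, 3)`, illustrating that values of `n` beyond `N` only repeat frequencies already found.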

81. Lagrange's Equations for the Weighted String. — The 

equations of motion which we have discussed above may also 
be obtained readily from Lagrange's method, and we shall set up 
expressions for the kinetic and potential energies. For the 
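kinetic energy $T_1$ we have simply

$$T_1 = \frac{m}{2}\left(\dot u_1^2 + \dot u_2^2 + \cdots + \dot u_N^2\right),$$

and for the potential energy

$$V = \frac{T}{2d}\left[2u_1^2 + (u_2 - u_1)^2 + (u_3 - u_2)^2 + \cdots + 2u_N^2\right],$$

and the Lagrangian equations

$$\frac{d}{dt}\left[\frac{\partial (T_1 - V)}{\partial \dot u_j}\right] - \frac{\partial (T_1 - V)}{\partial u_j} = 0$$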

lead to the equations already used. 

82. Continuous String as Limiting Case. — The solution we 
have found for the set of particles differs in two ways from the 
solution for the continuous string. First, there is only a finite 
set of overtones, and secondly, the frequencies are determined 
by different formulas. Both these differences disappear when 
the number of particles in the fixed length L becomes infinite. 
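To determine the limiting form of the expressions for the frequency, we develop $\cos\frac{n\pi}{N}$ in a power series for large $N$. We thus obtain

$$\cos\frac{n\pi}{N} = 1 - \frac12\left(\frac{n\pi}{N}\right)^2 + \cdots,$$

so that $\omega_n$ becomes

$$\omega_n = \frac{n\pi}{N}\sqrt{\frac{T}{md}} = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

using $Nd = L$ and $m = \mu d$. This agrees with our former result.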
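The approach to the limit can be seen numerically by comparing the frequency of Eq. (13) with the continuous-string value as $N$ grows at fixed length. This is a sketch with assumed unit values for $L$, $T$, and $\mu$; the function names are illustrative.

```python
import math

def omega_discrete(n, N, L=1.0, T=1.0, mu=1.0):
    """Eq. (13) for N particles on a string of fixed length L and density mu."""
    d = L / N          # spacing, from N d = L
    m = mu * d         # mass of each particle, from m = mu d
    return math.sqrt(2.0 * T / (m * d) * (1.0 - math.cos(n * math.pi / N)))

def omega_continuous(n, L=1.0, T=1.0, mu=1.0):
    """Limiting value (n pi / L) sqrt(T / mu) for the continuous string."""
    return n * math.pi / L * math.sqrt(T / mu)

# Ratio of discrete to continuous fundamental frequency for increasing N
ratios = {N: omega_discrete(1, N) / omega_continuous(1) for N in (4, 16, 64, 256)}
```

The ratios lie slightly below unity and rise toward it as `N` increases, which is the statement that the discrete frequencies go over into those of the continuous string.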
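In this limiting case of infinite $N$ (and infinitesimal $d$) the expressions for the kinetic and potential energies become

$$T_1 = \frac{\mu}{2}\int_0^L \dot u^2\,dx, \qquad V = \frac{T}{2}\int_0^L\left(\frac{\partial u}{\partial x}\right)^2 dx, \tag{14}$$

which may also be derived directly for the case of a continuous string.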


1. Taking the case of four particles on a string, derive their displacements 
in the four possible normal vibrations, and compute their frequencies. 
Compare these frequencies with the first four frequencies of the correspond- 
ing continuous string. Put in n = N + 1, and show how the solution 
reduces to one already found. 

2. An actual string is composed of atoms, rather than being continuous, 
so that it has only a finite number of possible overtones. Assume that it 
consists of a single string of atoms, spaced $10^{-8}$ cm. apart. Let the string
be 1 m. long, and at such tension that its fundamental is 100 cycles per 
second. Find the frequency of the highest possible harmonic, and show 
that it is in the infra-red region of the spectrum. Show that in this highest 
harmonic, successive atoms vibrate in opposite phases. Substances actually 
have such natural frequencies in the infra-red, and they are important in
connection with their specific heat. 

3. Prove that $u = \sin\omega[t - (x/v)]$ is a solution of the partial differential
equation for the vibrating string, if v is chosen properly, although it does not 
satisfy the boundary condition that the string be held at the ends. Con- 
sider the physical meaning of this solution, and show that it represents a 
wave traveling down the string with velocity v. 

4. Superpose the wave of Prob. 3, traveling along the +x axis, and a 
similar one traveling in the opposite direction, and show that the sum repre- 
sents a standing wave of the type discussed in this chapter. 

5. Find the wave length of the waves in the string, in the solution we have 
found in this chapter, and verify the relation $v = n\lambda$ between wave length $\lambda$,
frequency n, and velocity v. 

6. Proceeding as in Prob. 5, find the velocity of a wave along the weighted 
string, showing that it varies with frequency. Find a formula for the 


Fig. 21. Artificial electric line.

7. An artificial electric line can be constructed according to Fig. 21, 
consisting of N identical resistanceless circuits, each containing inductance 
L, capacitance C, and coupled to each of its neighbors with mutual inductance M. Set up the differential equations for the currents in the various circuits, showing that they reduce to the same form as with the weighted string.


8. Neglecting boundary conditions at the two ends of the line in Prob. 7, 
show that a disturbance can be propagated along the line with a definite 
velocity, as in Prob. 6. 

9. A string of length L is pulled aside at a point a distance D from the end, 
and then released. Thus its initial shape is given by a curve made of two 
straight lines, and its initial velocity is zero. Find the solution for its 
motion, and find the amplitude of the nth harmonic. 

10. Taking the solution of Prob. 9, for the special case where D = L/2, 
compute the first five terms of the Fourier series, when t = 0. Add them 
and plot the sum, showing how good an approximation they make to the 
correct curve. 

11. A string initially at rest is struck at a distance D from the end, at 
t = 0. Find the intensity in each overtone. Approximate the initial 
conditions as follows: the initial displacement is zero, and the initial velocity 
is a constant in a small region of length d about the point D, zero elsewhere. 


In the preceding chapter, we have worked out the elementary 
theory of the vibrating string, finding the nature of the possible 
vibrations and the method of getting the amplitude of the over- 
tones in terms of the initial conditions. When we begin to ask 
about slightly more complicated problems, however, we find 
that it is necessary to go further into the theory. For example, 
we might be interested in the nature of the forced vibrations 
under the action of an external sinusoidal force, or the effect of 
damping on the oscillations. Such questions are easily answered 
by introducing normal coordinates, much as we did with the two 
coupled oscillators. These are generalized coordinates, which 
prove to be closely connected with the various overtones, so that 
if just one normal coordinate is vibrating, that means that the 
string is vibrating with the corresponding pure overtone. When 
we write Lagrange's equations in terms of the normal coordinates, 
we find that we can introduce external forces easily, and solve 
such problems. 

At the same time, the general theory of normal coordinates 
for vibrating strings, which we shall get into, has particularly 
interesting relations to many other branches of mathematical 
physics. We shall gain much more insight into Fourier expan- 
sion, finding a general theory of expansion of which this is a 
special case, but which, as we shall find later, includes expansions 
in Bessel's functions, spherical harmonics, and many other sorts 
of functions. Such problems are met not only in vibrations, 
but also in heat flow (for which Fourier series were originally 
developed), potential theory, hydrodynamics, and in the newest 
branch of mathematical physics, wave mechanics, or the quantum 
theory, used in studying atomic structure. 

83. Normal Coordinates. — In Chap. XI, we investigated the 
vibrations of two coupled particles and set up normal coordinates 
to describe the motion. Since we must make a considerable 
extension of the idea of normal coordinates in the present chapter, 
it will be best to review the results we have already found. We 



started with two coordinates, $x_1$ and $x_2$, describing the displacement of the two particles. The normal coordinates $X$ and $Y$ were introduced by a linear transformation

$$x_1 = \alpha' X + \alpha'' Y$$

$$x_2 = \beta' X + \beta'' Y,$$

which proved to be merely a rotation of axes in the $x_1$-$x_2$ space, so that $X$ and $Y$ were new orthogonal axes in that space. To express the fact that the transformation was just a rotation, we had certain conditions holding between the coefficients: orthogonality conditions, as $\alpha'\alpha'' + \beta'\beta'' = 0$, and normalization conditions, as $\alpha'^2 + \beta'^2 = 1$. We saw that the quantities $\alpha'$, $\alpha''$, $\beta'$, $\beta''$ had a geometrical meaning: $\alpha'$ was the scalar product of the unit vectors along the $x_1$ and $X$ axes, with similar relations for the other quantities, showing that $\alpha'$, $\beta'$ were the components along the $x_1$ and $x_2$ axes, respectively, of the unit vector along $X$, and $\alpha''$, $\beta''$ similarly were the components of the unit vector along $Y$. The object of the rotation to normal coordinates was to separate the variables of the equation of motion, so that each normal coordinate executed a vibration of its own, as $X = Ce^{i\omega' t}$, $Y = De^{i\omega'' t}$. This was equivalent to a rotation so that the new axes in the $x_1 x_2$ space lay along the principal axes of the elliptical equipotentials of the problem.

We can now follow exactly the same model in our problem of the string. We start with the case of $N$ weights separated by springs. By analogy, the displacement of the first weight should be a linear combination of normal coordinates, the coefficients (corresponding to the $\alpha$'s and $\beta$'s) being the displacements of the first weight when only one overtone is excited, or $\sin\frac{n\pi x_1}{L}$ for the $n$th overtone. The displacements of these weights are taken to be $u_1 \ldots u_N$. Then we set up $N$ normal coordinates, $\phi_1 \ldots \phi_N$, by the equations

$$u_1 = \sum_n a_n \sin\frac{n\pi x_1}{L}\,\phi_n$$

$$u_2 = \sum_n a_n \sin\frac{n\pi x_2}{L}\,\phi_n, \text{ etc.,} \tag{1}$$

where $x_j = (j - \tfrac12)d$, and the numbers $a_n$ are determined by a condition soon to be described. Here the quantities $a_n \sin(n\pi x_j/L)$ correspond to the $\alpha$'s and $\beta$'s of the preceding chapter.


But not only that: the coefficients still satisfy orthogonality and normalization conditions. The orthogonality conditions will be of the form

$$a_n a_m\left(\sin\frac{n\pi x_1}{L}\sin\frac{m\pi x_1}{L} + \sin\frac{n\pi x_2}{L}\sin\frac{m\pi x_2}{L} + \cdots + \sin\frac{n\pi x_N}{L}\sin\frac{m\pi x_N}{L}\right) = 0, \tag{2}$$

where $n$, $m$ are any two different indices. This is true, as can be shown by trigonometrical manipulation, though we shall not stop to do it. Similarly the normalization will be

$$a_n^2\left(\sin^2\frac{n\pi x_1}{L} + \sin^2\frac{n\pi x_2}{L} + \cdots + \sin^2\frac{n\pi x_N}{L}\right) = 1. \tag{3}$$

We can satisfy this by proper choice of $a_n$, since the parenthesis is a definitely determined, positive quantity. This is then the condition, called the normalization condition, for determining the constants $a_n$. Since we have as before an orthogonal transformation, we can again get a geometrical interpretation. We imagine an $N$-dimensional space, in which the quantities $u_1 \ldots u_N$ are plotted as coordinates. Now our transformation of axes is equivalent to a rotation of coordinates in this $N$-dimensional space. The normal coordinates $\phi_1 \ldots \phi_N$ represent new orthogonal axes in the space, in the sense that if $\phi_1 = 1$ and all the other $\phi$'s are zero, the corresponding point is displaced from the origin unit distance along the $\phi_1$ axis. The quantities like $a_n\sin\frac{n\pi x_j}{L}$ represent the components in the direction of the old axes of unit vectors along the new axes. Thus the one written is the cosine of the angle between the $\phi_n$ axis and the axis belonging to the particle at $x_j$. The equations of motion are separated in the new coordinates, the solutions being $\phi_n = \text{constant} \times e^{i\omega_n t}$. Finally, the equipotentials, which are ellipsoids in the $N$-dimensional space, have principal axes in the directions which we have chosen for the normal coordinates. Thus the analogy with the two-dimensional problem is complete. The statements we have made without proof here are not very difficult to demonstrate, and some of them are taken up in problems.
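The orthogonality and normalization conditions for the weighted string, stated above without proof, can at least be checked numerically for a particular $N$. The sketch below builds the table of coefficients $a_n \sin(n\pi x_j/L)$ with $x_j = (j - \tfrac12)d$ and $L = Nd$; the function names and the choice of $N$ are assumptions for illustration only.

```python
import math

def normal_mode_matrix(N):
    """Rows are particles j = 1..N, columns are overtones n = 1..N.

    Each entry is a_n * sin(n pi x_j / L), where sin(n pi x_j / L) reduces to
    sin(n pi (j - 1/2) / N) and a_n is fixed by the normalization condition.
    """
    raw = [[math.sin(n * math.pi * (j - 0.5) / N) for n in range(1, N + 1)]
           for j in range(1, N + 1)]
    a = [1.0 / math.sqrt(sum(raw[j][n] ** 2 for j in range(N))) for n in range(N)]
    return [[a[n] * raw[j][n] for n in range(N)] for j in range(N)]

def column_dot(M, n, m):
    """Sum over particles of the product of the coefficients for overtones n and m
    (zero-based column indices)."""
    return sum(row[n] * row[m] for row in M)

M = normal_mode_matrix(5)
```

Here `column_dot(M, n, m)` should be 1 when `n == m` and zero to rounding error otherwise, which is what it means for the transformation to be a rotation in the $N$-dimensional space.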

We can now go one step farther, to the continuous string. Here the displacement of a point of the string is given by $u(x)$, where $x$ measures the coordinate of the point, corresponding to the index $i$ for the problem of discrete weights. We introduce normal coordinates $\phi_1 \ldots \phi_n, \ldots$ (an infinite set, as there are an infinite number of points on the string), by the equation

$$u(x) = \sum_n a_n\sin\frac{n\pi x}{L}\,\phi_n. \tag{4}$$

The orthogonality conditions for the coefficients $a_n\sin\frac{n\pi x}{L}$ must now be written in terms of integrals, rather than sums; for we have terms for each value of $x$, from 0 to $L$, differing by infinitesimal amounts. Thus these conditions are

$$\int_0^L a_n a_m\sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = 0, \tag{5}$$

where $n$ and $m$ are different integers. This can be immediately proved by evaluating the integral. Similarly the normalization condition is

$$\int_0^L a_n^2\sin^2\frac{n\pi x}{L}\,dx = 1, \tag{6}$$

which as before serves to determine $a_n$.

84. Normal Coordinates and Function Space. — We must now 
imagine a space, not of N dimensions, but of an infinite number. 
We cannot get an idea of what the coordinates mean, except by 
passing to the limit from the case of a finite number of mass 
points. With N points, and N dimensions, the first coordinate 
measures the displacement of the first mass point, and so on. 
Thus a point in the $N$-dimensional space determines all the
coordinates, or in other words gives the displacements of all the 
masses. Now as N gets larger and larger, and we have more 
and more dimensions, it still remains true that a particular 
coordinate measures the displacement at a particular part of the 
string. We see that this interpretation persists to the limit of 
infinitely many variables: each coordinate is connected with a 
point of the string, and its value gives the displacement at that 
point. But there is now an interesting side light on the situation. 
A point in our infinitely many-dimensional space gives complete 
information about the displacement of each point of the string. 
That is, it gives $u(x)$, a function of $x$. Each point of this space
is connected with a particular function, and each possible function 
is represented by a point of the space (of course, many points 
of the space refer to discontinuous functions and, therefore, are 


not suitable for describing a string). On account of this prop- 
erty, our space is often called a function space. 

The normal coordinates now represent a set of rectangular axes in function space, rotated with respect to the original coordinates. Each normal coordinate refers to a particular mode of vibration, or overtone. If just one of the normal coordinates is excited, say if $\phi_n = 1$, all the other $\phi$'s being zero, the situation is represented by a certain point in function space; that is, by a certain function, giving the shape of the string. We can take the radius vector out to the point $\phi_n = 1$, all other $\phi$'s $= 0$, and project it on one of our original coordinates. Thus the projection on the coordinate connected with the point $x$ is $a_n\sin(n\pi x/L)$, showing that that is the displacement of this particular point of the string when this overtone alone is excited with unit amplitude. The expression $a_n\sin(n\pi x/L)$ is now a function; it is the function represented by a unit vector along the $\phi_n$ axis, in function space. Since the $\phi$ axes are orthogonal, we see that the scalar product of two such vectors along different axes is zero:

$$\int_0^L a_n a_m\sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = 0,$$

where by analogy the scalar product takes this form, so that we have the orthogonality conditions and their geometrical meaning. Similarly the square of the unit vector, which is unity, is

$$\int_0^L a_n^2\sin^2\frac{n\pi x}{L}\,dx = 1,$$

the normalization condition. This immediately gives $a_n^2 L/2 = 1$, $a_n = \sqrt{2/L}$.

Now as before, when we introduce the normal coordinates, we have rotated axes in function space to make the new coordinates lie along the principal axes of the ellipsoidal equipotentials. And the equations of motion are separated, each normal coordinate vibrating with simple harmonic motion: $\phi_n = A_n e^{i\omega_n t}$. Finally, then, the motion is represented by

$$u(x) = \sum_{n=1}^{\infty}\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,A_n e^{i\omega_n t},$$

agreeing with the value found previously. In Section 86, we carry out the demonstration that the equations of motion are separated in the normal coordinates, and then we apply them to the discussion of forced motion.

85. Fourier Analysis in Function Space. — When we come to 
the question of satisfying initial conditions, and of Fourier's 
series, we meet immediately close connections with function 
space. Fourier's theorem, stated for sine series, can be put in the following form, by introducing terms $\sqrt{2/L}$:

$$f(x) = \sum_{n=1}^{\infty}\left(\sqrt{L/2}\,B_n\right)\sqrt{2/L}\,\sin\frac{n\pi x}{L}, \qquad \sqrt{L/2}\,B_n = \int_0^L f(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx. \tag{7}$$

Now the functions $\sqrt{2/L}\sin(n\pi x/L)$ are the unit vectors in function space along the directions representing the overtones of the problem, functions which are often called the normal functions, or characteristic functions, of the problem. Thus Eq. (7) is just like a vector equation, stating that a vector $f(x)$ is the sum of unit vectors $[\sqrt{2/L}\sin(n\pi x/L)]$ each multiplied by the component of the vector along the corresponding axis ($\sqrt{L/2}\,B_n$ is the component of $f(x)$ along the $n$th axis). To find these components, we need only project the vector $f(x)$ on the corresponding unit vector, which means taking the scalar product. But the scalar product, as we have seen, is an integral,

$$\left(f(x)\cdot\sqrt{2/L}\,\sin\frac{n\pi x}{L}\right) = \sqrt{L/2}\,B_n = \int_0^L f(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx. \tag{8}$$

Thus the formulas of Fourier's method have the simplest possible vector interpretation in function space. But we can also see that, if we had some other set of normalized orthogonal functions, we could proceed with an expansion in an analogous way. It is worth noting that by using Fourier's method, we can solve for the normal coordinates in terms of $u(x)$: since $u(x) = \sum_{n=1}^{\infty}\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,\phi_n$, it is obvious that $\phi_n = \int_0^L u(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx$, the component of $u(x)$ along the $n$th axis.
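The projection described above can be tried out numerically: build a shape from known amounts of two normal functions, then recover those amounts as scalar products. This is a sketch only; the length, the test shape, the midpoint-rule integration, and the function names are assumptions introduced for the example.

```python
import math

def normal_function(n, L):
    """Unit vector in function space: sqrt(2/L) sin(n pi x / L)."""
    return lambda x: math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def scalar_product(f, g, L, steps=4000):
    """Midpoint-rule approximation to the integral of f(x) g(x) from 0 to L."""
    h = L / steps
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(steps))

L = 2.0

def shape(x):
    """Hypothetical displacement with phi_1 = 0.3, phi_3 = -0.1, all others zero."""
    return 0.3 * normal_function(1, L)(x) - 0.1 * normal_function(3, L)(x)

# Components of the shape along the first four axes of function space
components = [scalar_product(shape, normal_function(n, L), L) for n in range(1, 5)]
```

The list `components` should reproduce the amounts put in, 0.3 along the first axis and -0.1 along the third, with the second and fourth components vanishing to rounding error.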

86. Equations of Motion in Normal Coordinates. — To find the 
equations of motion, we must set up the Lagrangian function. 
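Let us first write for the velocity of the string

$$\dot u = \dot\phi_1\sqrt{2/L}\,\sin\frac{\pi x}{L} + \dot\phi_2\sqrt{2/L}\,\sin\frac{2\pi x}{L} + \cdots + \dot\phi_n\sqrt{2/L}\,\sin\frac{n\pi x}{L} + \cdots$$

and proceed to the expressions for the potential and kinetic energies. We have

$$T_1 = \frac{\mu}{2}\int_0^L \dot u^2\,dx = \frac{\mu}{2}\int_0^L\left(\sum_{n=1}^{\infty}\dot\phi_n\sqrt{2/L}\,\sin\frac{n\pi x}{L}\right)^2 dx = \frac{\mu}{2}\sum_{n=1}^{\infty}\dot\phi_n^2\int_0^L\frac{2}{L}\sin^2\frac{n\pi x}{L}\,dx,$$

since all the product terms disappear because of the orthogonal properties of the normal functions. Thus $T_1$ becomes reduced to a sum of squares in the generalized velocities and the integration over $x$ leads to the result

$$T_1 = \frac{\mu}{2}\sum_{n=1}^{\infty}\dot\phi_n^2. \tag{9}$$

In a similar manner we set up the expression for $V$, the potential energy. We have

$$V = \frac{T}{2}\int_0^L\left(\frac{\partial u}{\partial x}\right)^2 dx = \frac{T}{2}\int_0^L\left(\sum_{n=1}^{\infty}\phi_n\sqrt{2/L}\,\frac{n\pi}{L}\cos\frac{n\pi x}{L}\right)^2 dx,$$

which we treat exactly as in the case of the kinetic energy and obtain $V$ as a sum of squares of the generalized coordinates $\phi_n$,

$$V = \frac{T}{2}\sum_{n=1}^{\infty}\left(\frac{n\pi}{L}\right)^2\phi_n^2. \tag{10}$$

Using the Lagrangian equations of motion, we have

$$L_1 = T_1 - V, \qquad \frac{d}{dt}\left(\frac{\partial L_1}{\partial\dot\phi_n}\right) = \mu\ddot\phi_n, \qquad \frac{\partial L_1}{\partial\phi_n} = -\left(\frac{n\pi}{L}\right)^2 T\phi_n,$$

so that the equations of motion become

$$\mu\ddot\phi_n + \left(\frac{n\pi}{L}\right)^2 T\phi_n = \Phi_n, \qquad n = 1, 2, \cdots, \tag{11}$$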

where $\Phi_n$ is the generalized external force corresponding to the coordinate $\phi_n$. Up to this point we have considered only free vibrations for which $\Phi_n$ is zero, but now we have generalized our problem to include such things as forced oscillations. We solve the equations above for the case of free vibrations, obtaining

$$\phi_n = A_n e^{i\omega_n t}, \qquad \omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

so that our expression for $u$ is just the one originally found, in Eq. (3), Chap. XII. It is thus clear that we have essentially used normal coordinates in our first discussion of the vibrating string.

The generalized force $\Phi_n$ is defined so that the work done by the external force during a displacement $d\phi_n$ is $\Phi_n\,d\phi_n$. During a displacement $d\phi_n$, the corresponding displacement of the string is $a_n\sin\frac{n\pi x}{L}\,d\phi_n = du$, so that if the force acting on a length of string $dx$ at time $t$ is $f\,dx$, we find for $\Phi_n$,

$$\Phi_n = \sqrt{2/L}\int_0^L f\sin\frac{n\pi x}{L}\,dx. \tag{12}$$

In function space, this is evidently simply the component of $f$ along the $n$th axis. An interesting case occurs when the force acts practically at a point $x = a$, such as when a violin string is plucked or bowed. We then write

$$\Phi_n = \sqrt{2/L}\,\sin\frac{n\pi a}{L}\int f\,dx = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}.$$

This expression brings out the advantage of the concept of generalized force. For example, if a string is struck or bowed at its center, then $a = L/2$, and $\Phi_n = 0$ when $n$ is an even integer. This means that this force can have no effect on the even overtones and can only affect the odd overtones. If the string is originally at rest, no matter what kind of force is applied at the center, only odd overtones appear in the resultant vibration. No even overtones ever occur, as the normal coordinates are uncoupled and each normal coordinate behaves just as if all the others were absent. These conclusions, immediately obvious from the expression for $\Phi_n$, are not at all obvious if one considers only the usual force acting at a point of the string.
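The vanishing of the even generalized forces for a force applied at the center is easy to tabulate. The following sketch evaluates $\Phi_n = F\sqrt{2/L}\,\sin(n\pi a/L)$ for the first six overtones; the unit values of $F$ and $L$ and the function name are assumptions for illustration.

```python
import math

def generalized_force(n, a, F, L):
    """Phi_n = F sqrt(2/L) sin(n pi a / L) for a total force F concentrated at x = a."""
    return F * math.sqrt(2.0 / L) * math.sin(n * math.pi * a / L)

L = 1.0
# Generalized forces for n = 1..6 when the force acts at the center, a = L/2
forces = [generalized_force(n, L / 2, 1.0, L) for n in range(1, 7)]
```

The entries for even `n` are zero up to floating-point rounding, while those for odd `n` alternate in sign with magnitude $\sqrt{2/L}$. Moving the point of application, say to `a = L / 3`, instead removes every third overtone.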

Another case of interest occurs when a periodic force acts on a point $x = a$ of the string. We then have

$$\Phi_n = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}\cos\omega t$$

and the equation of motion for $\phi_n$ is very much like the equation of forced motion of a one-dimensional oscillator. The solution of this equation is then

$$\phi_n = A_n\cos(\omega_n t - \epsilon_n) + \frac{F\sqrt{2/L}\,\sin(n\pi a/L)}{\mu(\omega_n^2 - \omega^2)}\cos\omega t.$$

The first term is the solution of the homogeneous equation and represents the free vibration of this mode, and the second represents the forced vibration indicating all the characteristics of resonance which we have previously studied.

87. The Vibrating String with Friction. — Thus far we have 
neglected friction forces which must act in real cases. Let us
assume that the motion of our string is opposed by a frictional 
force such that the force on each element of the string is propor- 
tional to its velocity. The partial differential equation of the 
free motion of the string becomes 

$$\frac{\partial^2 u}{\partial t^2} + k\frac{\partial u}{\partial t} = \frac{T}{\mu}\frac{\partial^2 u}{\partial x^2}. \tag{13}$$

We can treat this problem rather simply by noting that there is a function $G$, called the dissipation function, which is one-half the rate at which energy disappears from the system and which has just the same form as the kinetic energy $T_1$. In fact, we have

$$G = \frac12\mu k\int_0^L \dot u^2\,dx.$$

One can easily show that the Lagrangian equations when there is a dissipation function become

$$\frac{d}{dt}\left(\frac{\partial L_1}{\partial\dot q_i}\right) - \frac{\partial L_1}{\partial q_i} + \frac{\partial G}{\partial\dot q_i} = Q_i. \tag{14}$$

According to the special law of friction we have assumed, the dissipation function has the same form as $T_1$, so that if we introduce the normal coordinates $\phi_1$, $\phi_2$, . . . etc., which we found reduced the expressions for $T_1$ and $V$ to sums of squares, they will also do the same for $G$, so that we can separate the equations of motion for each coordinate $\phi_n$. Proceeding as in the last paragraph, we find

$$G = \frac{\mu k}{2}\sum_n\dot\phi_n^2.$$

The equation for $\phi_n$ then becomes

$$\ddot\phi_n + k\dot\phi_n + \omega_n^2\phi_n = \frac{\Phi_n}{\mu}, \tag{15}$$


which is the same form as in the case of a one-dimensional damped 
oscillator. From this we see that each of the overtones has the 
same logarithmic decrement, so that in a free vibration the vari- 
ous overtones maintain their relative amplitudes. 

In the case of a forced vibration caused by a periodic force $F\cos\omega t$ acting at the point $x = a$, we have

$$\Phi_n = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}\cos\omega t,$$

and the steady-state vibrations are given by

$$\phi_n = \frac{F}{\mu}\sqrt{2/L}\,\sin\frac{n\pi a}{L}\left[\frac{(\omega_n^2 - \omega^2)\cos\omega t + k\omega\sin\omega t}{(\omega_n^2 - \omega^2)^2 + k^2\omega^2}\right]. \tag{16}$$

This is, of course, essentially the same solution we obtained in 
the discussion of a one-dimensional oscillator. 
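As a check on the steady-state form, one can substitute it into the damped equation of motion for a single normal coordinate, $\ddot\phi_n + k\dot\phi_n + \omega_n^2\phi_n = \Phi_n/\mu$, and confirm numerically that the two sides agree. The sketch below does this with central differences. The parameter values and function names are arbitrary assumptions, and `force_over_mu` stands for the combination $(F/\mu)\sqrt{2/L}\,\sin(n\pi a/L)$.

```python
import math

def steady_state(t, omega, omega_n, k, force_over_mu):
    """Steady-state phi_n(t) for a driving term force_over_mu * cos(omega t)."""
    denom = (omega_n ** 2 - omega ** 2) ** 2 + (k * omega) ** 2
    numer = ((omega_n ** 2 - omega ** 2) * math.cos(omega * t)
             + k * omega * math.sin(omega * t))
    return force_over_mu * numer / denom

def residual(t, omega, omega_n, k, force_over_mu, h=1e-4):
    """Left side minus right side of the damped equation of motion,
    with time derivatives estimated by central differences of step h."""
    def phi(s):
        return steady_state(s, omega, omega_n, k, force_over_mu)
    d1 = (phi(t + h) - phi(t - h)) / (2.0 * h)
    d2 = (phi(t + h) - 2.0 * phi(t) + phi(t - h)) / h ** 2
    return d2 + k * d1 + omega_n ** 2 * phi(t) - force_over_mu * math.cos(omega * t)
```

For example, `residual(0.7, 1.3, 2.0, 0.2, 1.0)` should be small compared with the driving term, limited only by the finite-difference error. Setting `k = 0` recovers the undamped forced term with denominator $\omega_n^2 - \omega^2$.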

The particularly simple solutions just obtained depend entirely 
on the simple form of the law of friction we have assumed. In 
general, for vibrating systems, the presence of frictional forces 
does not prevent us from setting up the kinetic and potential 
energies as a sum of squares. But this transformation will in 
general not transform the dissipation function G to a sum of 


squares. Only in very special cases, such as the law of friction 
assumed above, does the transformation also reduce G to a sum 
of squares. The general equation of motion for the coordinate $\phi_1$, for example, will be of the form

$$a_1\ddot\phi_1 + c_1\phi_1 + \sum_i b_{1i}\dot\phi_i = \Phi_1$$

instead of the simpler form obtained above

$$a_1\ddot\phi_1 + b_{11}\dot\phi_1 + c_1\phi_1 = \Phi_1,$$

in which we have only $\phi_1$ appearing. Thus in the general case
of frictional forces there is coupling between the various coordi- 
nates so that we have much more complicated types of motion. 
In Chap. XI, Prob. 3, we had such a case with two coupled 
circuits with resistance and found that we could get a simple 
solution only for very small frictional forces. 


1. Write down the Hamiltonian function for a vibrating string, using 
normal coordinates. Set up Hamilton's equations, and show that they 
are satisfied for the solution we have found. 

2. A sinusoidal force of constant amplitude but adjustable frequency 
acts on an arbitrary point of a string. The string is in addition damped by 
a frictional force proportional to the velocity. Discuss the resonance of 
the string to the force, computing, for example, the total energy of the 
string as a function of the applied frequency, and showing that the resulting 
resonance curve goes through maxima corresponding to the various over- 
tone frequencies. Find approximate heights and breadths of the maxima. 
Neglect the transient vibrations. 

3. Prove the orthogonality relations for the normal functions for the 
weighted string; that is, prove, for $n \neq m$,

$$\sin\frac{n\pi x_1}{L}\sin\frac{m\pi x_1}{L} + \cdots + \sin\frac{n\pi x_N}{L}\sin\frac{m\pi x_N}{L} = 0.$$

4. Using the orthogonality relations of Prob. 3, and the analogy of the
continuous string, set up a method for finding the amplitudes of the various 
overtones of the weighted string, in terms of the initial displacements and 
velocities of the particles. 

5. Apply the method of Prob. 4 to the special case of two coupled par- 
ticles, as taken up in Prob. 1, Chap. XI. 

6. Apply Prob. 4 to the case of four particles, as in Prob. 1, Chap. XII. 

7. Consider two coupled mechanical vibrating systems, with friction. 
In general, a dissipative function cannot be set up, and the problem of the 
motion cannot be solved exactly. Show what relations the frictional forces 
must satisfy in order to have a dissipative function. Write down the
corresponding relations also for the electrical case. 


8. What sort of force must be applied to a string in order that the forced 
motion should be a pure vibration of the nth harmonic? 

9. Consider the case of two coupled particles as in Prob. 1, Chap. XI. 
Show that if equal external forces act on both, the overtone in which they 
vibrate in opposite directions can never be excited. 

10. In the case of the two coupled particles of Prob. 1, Chap. XI, assume 
that at t = 0 both particles are at rest, but that one particle is displaced a
distance d, the other not being displaced at all. Find the amplitudes of the 
two overtones, writing down the formulas for the displacements of each 
particle as functions of time. 



In the last two chapters, we have considered the problem of 
the vibration of a string of constant density and uniform tension. 
These results may now be extended for the more general case of 
variable tension and density. We shall not be able to carry 
through the results in complete detail; for, as we shall see, we are 
led to a more complicated differential equation, which we cannot 
solve in general. But we shall find that the theory of expansion 
in orthogonal functions, and all the general relations, go through 
just as with the uniform string, so that we can derive a good deal 
of information. We shall also develop perturbation methods, 
which can be used when the tension and density have only small 
deviations from constancy. 

The importance of the problems considered in this chapter 
arises more from what they suggest than from the specific 
problems considered. Strings of variable density are of small 
practical importance. But the string is the simplest case of a 
vibrating continuum. Waves in three dimensions resemble 
waves on a string. A string of variable density resembles an 
optical medium of variable index of refraction, and we meet 
problems of reflection and refraction. Many three-dimensional 
problems can actually be reduced to one-dimensional cases, and 
these are all likely then to take on just the character of our string 
of variable density. It forms, so to speak, the type for much of 
our more complicated work. In wave mechanics, for instance, 
most of our problems reduce to a mathematical form which is 
identical with that of the present chapter. The perturbation 
theory we develop in this chapter is one set up originally for use 
with variable strings, yet it has had most important effects in 
the development of the quantum theory. 

88. Differential Equation for the Variable String. — We set up 
the differential equation of motion exactly as we have done in 
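Chap. XII. In calculating the resultant force on an element $dx$ of our string we found $\left(T\frac{\partial u}{\partial x}\right)_{x+dx} - \left(T\frac{\partial u}{\partial x}\right)_x$, and this is $\frac{\partial}{\partial x}\left(T\frac{\partial u}{\partial x}\right)dx$, which reduces as before to $T\frac{\partial^2 u}{\partial x^2}\,dx$ for constant tension. The remainder of the derivation proceeds as before, and the equation of motion becomes:

$$\mu\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(T\frac{\partial u}{\partial x}\right), \tag{1}$$

where both $T$ and $\mu$ are now functions of $x$. If we assume that $u$ is proportional to a function of $x$ times $e^{i\omega t}$, we find that we get an equation for the function of $x$ alone:

$$\frac{d}{dx}\left(T\frac{du}{dx}\right) + \omega^2\mu u = 0, \tag{2}$$

where this $u(x)$ is the part of $u$ depending on $x$.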

89. Approximate Solution for Slowly Changing Density and 
Tension. — The above Eq. (2) is a linear second-order differential 
equation with variable coefficients, on account of the functions 
$T$ and $\mu$, which depend on $x$. We can give no general method of
exact solution, except the power series method. To apply that,
of course, $T$ and $\mu$ must be expressed as power series in $x$. But
it turns out that the solutions of the equation are not very different
from sines and cosines of $x$, and a very useful approximate
method of solution is based on this fact, good when the density
and tension do not change by a large fraction of themselves in
one wave length. This approximate solution is simple, and
forms a convenient method for discussing the equation qualitatively.
The effect of the variable density and tension comes in
two ways: first, the wave length depends on the position, and
second, the amplitude depends on $x$. Thus, instead of
$A \sin (2\pi x/\lambda)$, as with the uniform string, the actual solution
for the function of $x$ can be at least approximately written in the
form $u = A(x) \sin B(x)$. We can see easily the form which $B$ must have
for the nonuniform string. For plainly $[B(x_2) - B(x_1)]/2\pi$ must
measure the number of wave lengths between $x_1$ and $x_2$, on
account of the way in which $B$ appears in the sine function.
But now if $\lambda$ is the wave length, regarded as a function of $x$,
$dx/\lambda$ is just the number of wave lengths in distance $dx$, so that

$$\frac{B(x_2) - B(x_1)}{2\pi} = \int_{x_1}^{x_2} \frac{dx}{\lambda},$$

from which evidently $B(x) = 2\pi \int dx/\lambda$. Since the wave length can
also be written $2\pi/\lambda = \omega\sqrt{\mu/T}$, this is equivalent to
$B(x) = \omega \int \sqrt{\mu/T}\,dx$. It is not hard to show that if we set
$A = \text{constant}/(T\mu)^{1/4}$, the resulting function

$$A e^{iB} = \frac{\text{constant}}{(T\mu)^{1/4}}\, e^{i\omega\int\sqrt{\mu/T}\,dx}, \qquad (3)$$

or the corresponding real quantity

$$\frac{\text{constant}}{(T\mu)^{1/4}} \cos\left(\omega\int\sqrt{\mu/T}\,dx\right),$$

forms an approximate solution of the differential equation.

To prove this equation, we may proceed as follows: we assume
the solution

$$u = A\, e^{i\omega\int\sqrt{\mu/T}\,dx},$$

where $A$ is an undetermined function of $x$, and substitute in the
differential equation. When the necessary differentiations and
substitutions are performed, we obtain a differential equation
for $A$, which may be written, after a little manipulation,

$$\lambda^2\left(\frac{1}{A}\frac{d^2A}{dx^2} + \frac{1}{T}\frac{dT}{dx}\,\frac{1}{A}\frac{dA}{dx}\right) + 4\pi i\,\lambda\left(\frac{1}{A}\frac{dA}{dx} + \frac{1}{4T}\frac{dT}{dx} + \frac{1}{4\mu}\frac{d\mu}{dx}\right) = 0, \qquad (4)$$

where $\lambda = \dfrac{2\pi}{\omega}\sqrt{T/\mu}$, the wave length of the disturbance. Now
we are assuming that $\mu$, $T$, and consequently $A$, do not change by
a large fraction of themselves in a wave length. Thus the quantities
like $\lambda\,\dfrac{1}{A}\dfrac{dA}{dx}$, measuring the fractional change in $A$ in a wave
length, are numbers small compared with 1. Their squares,
then, and their rates of change in one wave length, can be
neglected, and that means that the first set of terms above, in
$\lambda^2$, can be neglected in comparison with the second set, in $\lambda$.
Considering only the latter terms, we can rewrite Eq. (4) as

$$\frac{d\ln A}{dx} + \frac{1}{4}\left(\frac{d\ln T}{dx} + \frac{d\ln\mu}{dx}\right) = 0, \qquad A(\mu T)^{1/4} = \text{constant},$$

giving the solution we wished to prove.
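
As a rough numerical check on this approximation, the sketch below uses one case where an exact answer is available for comparison: a string of constant tension $T$ and density $a/x^2$ (the case set in Prob. 3), whose exact solutions are $\sqrt{x}\cos(k \ln x)$ with $k^2 + 1/4 = \omega^2 a/T$. The approximate method gives the same amplitude factor $\sqrt{x}$, but the phase constant $\omega\sqrt{a/T}$ in place of $k$. The function names and the numerical values are illustrative assumptions, not part of the text.

```python
import math

def exact_phase_constant(omega, a, T):
    """k in the exact solution sqrt(x)*cos(k ln x), from k^2 + 1/4 = omega^2 a / T."""
    return math.sqrt(omega**2 * a / T - 0.25)

def approx_phase_constant(omega, a, T):
    """Coefficient of ln x in omega * integral sqrt(mu/T) dx, with mu = a / x^2."""
    return omega * math.sqrt(a / T)

def relative_error(omega, a=1.0, T=1.0):
    """Fractional error of the approximate phase constant (hypothetical a, T)."""
    k = exact_phase_constant(omega, a, T)
    return (approx_phase_constant(omega, a, T) - k) / k

# The error should shrink as the wave length becomes short compared
# with the distance over which the density changes appreciably.
for omega in (1.0, 5.0, 25.0):
    print(omega, relative_error(omega))
```

The printed errors fall steadily as the frequency rises, which is the sense in which the approximation is "good when the density and tension do not change by a large fraction of themselves in one wave length."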


90. Progressive Waves and Standing Waves. — In the problems 
of Chap. XII, we noted that there were two sorts of waves 
possible in a uniform string: progressive waves, and standing 
waves. The progressive waves traveled along with a velocity v; 
an example was $\cos\omega(t - x/v)$, in which the displacement has the
same value at all points for which $t - x/v$ = constant, or
$x = vt$ + constant, points traveling along with velocity $v$. Similarly
in our general case, we can set up a complex solution

$$\frac{\text{constant}}{(\mu T)^{1/4}}\, e^{i\omega\left(t - \int dx/v\right)},$$

where $v = \sqrt{T/\mu}$. The real part is

$$\frac{\text{constant}}{(\mu T)^{1/4}} \cos\omega\left(t - \int\frac{dx}{v}\right),$$

where the equation $t - \int dx/v$ = constant gives, by differentiation,
$dx/dt = v$, verifying that the velocity of propagation of the
progressive wave is $v = \sqrt{T/\mu}$, varying from point to point
along the string. Thus in the general case we can have a progressive
wave along the nonuniform string. We shall see later in the
chapter, however, that this is only approximately true for strings
with slowly varying density and tension. At a rapid variation
of constants, a reflected wave is set up, traveling in the opposite
direction, and the superposition of direct and reflected waves
eventually produces something more like a standing wave.
An example of a standing wave with a uniform string is
$\sin\omega t\,\sin(\omega x/v)$, or in the general case

$$\frac{\text{constant}}{(\mu T)^{1/4}}\,\sin\omega t\,\sin\omega\int\frac{dx}{v}.$$

This is a product of a function of $t$ and a function of $x$, so that such
a wave has nodes, values of $x$ for which the function of $x$ is always
zero, so that the vibration always has zero amplitude. We have
seen that by combination of two progressive waves we can build
up a standing wave; similarly by adding two standing waves
we can get a progressive wave, as we see from the relation

$$\cos\omega t\,\cos\omega\int\frac{dx}{v} + \sin\omega t\,\sin\omega\int\frac{dx}{v} = \cos\omega\left(t - \int\frac{dx}{v}\right).$$

Thus either sort of wave satisfies the differential equation, and
we can add solutions as we always can with homogeneous linear
differential equations.


Now suppose a string is held at one point. That means that
we must limit ourselves to a particular set of solutions of the
differential equation: the standing waves which have a node at
that point. Thus in our approximate solution, we must take
the space function

$$\frac{\text{constant}}{(\mu T)^{1/4}}\,\sin\omega\int_{x_0}^{x}\frac{dx}{v},$$

where $x_0$ is the point where the string is held. Suppose we
imagine a semi-infinite string, held at one point, with a wave
train of finite length approaching the end. The wave is reflected
from the end, travels back, and the superposition of the two
trains, in opposite directions, forms the standing wave. This
wave will have nodes at definite points on the string. It may
have any arbitrary frequency, but the nodes will be differently
spaced with different frequencies.

If now the string is held at two points, instead of one, we meet
a difficulty: with an arbitrary frequency, the string will not
have a node at the second point. We must limit our frequency to
one of the discrete set for which there are nodes at both ends.
Thus the fact of having the string held at both ends automatically
sets up a discrete set of possible frequencies of vibration, the
overtones, with a particular form of vibration for each. We let
the nth overtone have a wave form represented by $u_n(x)$, and an
angular frequency $\omega_n$. Thus the whole solution may be written

$$u = \sum_n (A_n \cos\omega_n t + B_n \sin\omega_n t)\,u_n(x), \qquad (5)$$

where the constants $A_n$ and $B_n$ are chosen to satisfy the initial
conditions at $t = 0$. If our analytic approximation to the
function is good, we have

$$u_n(x) = \frac{\text{constant}}{(\mu T)^{1/4}}\,\sin\omega_n\int_{x_0}^{x}\frac{dx}{v}, \qquad (6)$$

with $v = \sqrt{T/\mu}$. Since the displacement is zero not only at $x_0$,
but also at the other end $x_1$, we must have

$$\omega_n\int_{x_0}^{x_1}\frac{dx}{v} = n\pi, \qquad (7)$$

where $n$ is an integer, which as we readily see equals 1 when there
are no nodes between the ends, 2 when there is one node, etc.
This leads at once to the condition

$$\omega_n = \frac{n\pi}{\int_{x_0}^{x_1} dx/v} \qquad (8)$$

for the angular velocities, which for the uniform string reduces to
$\omega_n = \dfrac{n\pi}{L}\sqrt{\dfrac{T}{\mu}}$, where $L = x_1 - x_0$ is the length of the string. If
our analytic approximation to the functions $u_n$ is not good, we
must simply choose those particular functions for our $u_n$'s which
have nodes at $x_0$ and $x_1$, labeling them in order, the one with
$n - 1$ nodes between the ends being called $u_n$, and then must
find the angular frequencies connected with these particular
functions. We meet such a case, for example, in some of the problems,
where the functions $u_n$ are Bessel's functions, and where we
simply must look up the nodes in tables of the roots of Bessel's
functions. The particular functions $u_n$ satisfying both differential
equation and boundary conditions are called normal functions,
or characteristic functions, or wave functions, and the
frequencies $\omega_n$ are sometimes called characteristic numbers.
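
The condition that the phase $\omega_n \int dx/v$ accumulated between the two fixed ends equal $n\pi$ can be evaluated numerically for any given density and tension. The following is a minimal sketch under the slowly-varying approximation of this section; the function name, the midpoint-rule integration, and the sample densities are assumptions chosen only for illustration.

```python
import math

def overtone_frequencies(mu, T, x0, x1, n_max, steps=2000):
    """Approximate omega_n = n*pi / integral_{x0}^{x1} dx/v, with v = sqrt(T/mu).

    mu and T are functions of x; the integral is done by the midpoint rule.
    """
    h = (x1 - x0) / steps
    travel = 0.0
    for i in range(steps):
        x = x0 + (i + 0.5) * h
        travel += h / math.sqrt(T(x) / mu(x))
    return [n * math.pi / travel for n in range(1, n_max + 1)]

# Uniform string: should reproduce omega_n = (n*pi/L) * sqrt(T/mu).
uniform = overtone_frequencies(lambda x: 2.0, lambda x: 8.0, 0.0, 1.0, 3)
# Hypothetical string whose density rises linearly from 1 to 3.
varying = overtone_frequencies(lambda x: 1.0 + 2.0 * x, lambda x: 8.0, 0.0, 1.0, 3)
print(uniform)
print(varying)
```

Within this approximation the overtones remain integral multiples of the fundamental even for the nonuniform string; departures from that rule appear only when the approximation itself fails.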

91. Orthogonality of Normal Functions. — We can now prove 
easily, and quite generally, that the normal functions $u_n$ are
orthogonal. For this purpose we consider two normal functions
$u_n$ and $u_m$, which are solutions of the differential equation. We
then have the identities

$$\frac{d}{dx}\left(T\frac{du_n}{dx}\right) + \omega_n^2\,\mu\,u_n = 0, \qquad \frac{d}{dx}\left(T\frac{du_m}{dx}\right) + \omega_m^2\,\mu\,u_m = 0.$$

We multiply the first equation by $u_m$, the second by $u_n$, subtract
one from the other, and then integrate over the string, which we
assume to extend from $x = 0$ to $x = L$. We thus obtain

$$\int_0^L\left[u_m\frac{d}{dx}\left(T\frac{du_n}{dx}\right) - u_n\frac{d}{dx}\left(T\frac{du_m}{dx}\right)\right]dx = (\omega_m^2 - \omega_n^2)\int_0^L \mu(x)\,u_n u_m\,dx.$$

The left side integrated by parts yields immediately

$$\left[T\left(u_m\frac{du_n}{dx} - u_n\frac{du_m}{dx}\right)\right]_0^L - \int_0^L T\left(\frac{du_n}{dx}\frac{du_m}{dx} - \frac{du_m}{dx}\frac{du_n}{dx}\right)dx.$$

The integral obviously vanishes, and the integrated part vanishes
since both $u_n$ and $u_m$ are zero for $x = 0$ and $x = L$. In general
the integrated part would vanish if either $u$ or $du/dx$ vanished
at the boundaries, or if an expression of the form $u + a\,du/dx$
vanished at each boundary. Thus the right side of the equation
above yields us as the analogue of our former orthogonality relation

$$\int_0^L \mu(x)\,u_n u_m\,dx = 0, \quad \text{if } n \neq m, \qquad (9)$$

since, when $n = m$, the integral need not vanish to satisfy the
original equation. We shall assume the functions to be normalized
so that

$$\int_0^L \mu(x)\,u_n^2\,dx = 1. \qquad (10)$$

In the previous chapter, where the density $\mu$ was independent
of $x$, we could simply omit that factor in the integrals, changing
the normalization condition to $\int u_n^2\,dx = 1$, without any error
other than a change of a constant factor in the functions $u_n$.
Here, however, the density factor must be kept in. We can see
the analogy to the corresponding situation with the two coupled
particles. There, if the masses of the particles were $m_1$, $m_2$,
and their displacements were $y_1$, $y_2$, we had to set up new quantities
$x_1$, $x_2$, equal to $\sqrt{m_1}\,y_1$ and $\sqrt{m_2}\,y_2$, respectively. We
could give the normalization conditions by stating, for example,
that the unit vector along X has unit magnitude. The coordinates
of the extremity of this vector, in the notation of Chap. XI,
were $x_1 = \alpha'$, $x_2 = \beta'$. Squaring the magnitude of this vector,
we had the normalization condition

$$x_1^2 + x_2^2 = \alpha'^2 + \beta'^2 = 1.$$

But this is equal to $m_1 y_1^2 + m_2 y_2^2$, where the $y$'s are the actual
displacements. Thus in that case, just as here, we must weight
the squares or products of displacements, where they appear
in the orthogonality or normalization conditions, with the respective
masses. Here the term $\mu(x)\,dx$ is just the mass of the element
$dx$, so that the analogy is complete.
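
The role of the density as a weight factor can be seen numerically. The sketch below takes a string of constant tension with density $a/x^2$ held at $x_1$ and $x_2$, for which the normal functions are $\sqrt{x}\,\sin[n\pi \ln(x/x_1)/\ln(x_2/x_1)]$ (the case set in Prob. 3), and compares the integral of $u_n u_m$ taken with and without the factor $\mu(x)$. The end points, the value of $a$, and the function names are assumptions made for the example, and the functions are left unnormalized.

```python
import math

def normal_function(n, x, x1, x2):
    """sqrt(x) * sin(n*pi*ln(x/x1)/ln(x2/x1)), unnormalized, for mu = a/x^2."""
    return math.sqrt(x) * math.sin(n * math.pi * math.log(x / x1) / math.log(x2 / x1))

def product(n, m, weighted, a=1.0, x1=1.0, x2=3.0, steps=4000):
    """Midpoint-rule estimate of the integral of w(x) u_n u_m from x1 to x2.

    w(x) = a/x^2 when weighted is True, and w(x) = 1 otherwise.
    """
    h = (x2 - x1) / steps
    total = 0.0
    for i in range(steps):
        x = x1 + (i + 0.5) * h
        w = a / x**2 if weighted else 1.0
        total += w * normal_function(n, x, x1, x2) * normal_function(m, x, x1, x2) * h
    return total

print("weighted   n=1, m=2:", product(1, 2, True))
print("unweighted n=1, m=2:", product(1, 2, False))
print("weighted   n=2, m=2:", product(2, 2, True))
```

Only the weighted integral vanishes for $n \neq m$; dropping the density factor leaves a sizable cross term, which is why the factor cannot be omitted for the nonuniform string.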

92. Expansion of an Arbitrary Function Using Normal Func- 
tions. — We have seen that we can write our solution 

$$u = \sum_n (A_n \cos\omega_n t + B_n \sin\omega_n t)\,u_n(x).$$

If the initial conditions are $u(x, 0) = f(x)$ and $\dfrac{\partial u}{\partial t}(x, 0) = F(x)$,
where $u(x, t)$ is the function of coordinate and time, we have,
substituting in our general solution,

$$f(x) = \sum_n A_n u_n, \qquad F(x) = \sum_n B_n \omega_n u_n, \qquad (11)$$

and we have the general problem of expanding an arbitrary function
in a series of normal functions, very much like our previous
problem of expanding an arbitrary function in a Fourier series.
As before, we shall content ourselves with showing that we can
find expressions for the coefficients $A_n$ and $B_n$ which formally
satisfy this type of expansion. The remainder of the problem,
showing that the series so built up really represents the function
and that it converges, will not be taken up here. It is sufficient
to say that such proofs can be given.

Let us multiply each of Eqs. (11) on both sides by $\mu(x)\,u_m$, and
integrate from $x = 0$ to $x = L$. We thus have

$$\int_0^L \mu(x)\,u_m f(x)\,dx = \sum_n A_n \int_0^L \mu(x)\,u_m u_n\,dx,$$

$$\int_0^L \mu(x)\,u_m F(x)\,dx = \sum_n \omega_n B_n \int_0^L \mu(x)\,u_m u_n\,dx.$$

On the right side of each of these equations each term for which
$m \neq n$ vanishes because of our orthogonality relations. The
remaining term contains an integral which has the value unity
if the functions $u_n$ are normalized. Thus the whole sum reduces
to $A_m$ (or in the second equation to $\omega_m B_m$), and we have found
expressions for our coefficients:

$$A_m = \int_0^L \mu(x)\,f(x)\,u_m\,dx,$$

$$B_m = \frac{1}{\omega_m}\int_0^L \mu(x)\,F(x)\,u_m\,dx. \qquad (12)$$

It is clear that our discussion of the Fourier expansion is but 
a special case of the general one here discussed. The most con- 
venient point of view to take is to define the scalar product of 
two functions $f(x)$ and $\phi(x)$ as

$$\int_0^L \mu(x)\,f(x)\,\phi(x)\,dx.$$


Then clearly our orthogonality and normalization conditions 
are just what we should expect from our discussion of orthogonal 
vectors in function space, in the last chapter. The rotation 
of coordinates in function space again separates variables, 
as it did in the case of the uniform string; but now the separate 
normal or characteristic functions are more complicated in 
form, as we see from the more complicated differential equations 
they satisfy, though they still vibrate sinusoidally with time. 
When we carry out an expansion of a function f(x) in terms 
of the characteristic functions, the coefficients, as with the 
Fourier expansion, are just the scalar products of the correspond- 
ing characteristic functions with the given function, or 

$$\int_0^L \mu(x)\,f(x)\,u_n\,dx,$$

as we wrote above. 
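
To make the recipe concrete, the sketch below expands a simple initial shape in the normal functions of a uniform string, for which the normalized functions are $\sqrt{2/(\mu L)}\,\sin(n\pi x/L)$, taking each coefficient as the weighted scalar product $\int_0^L \mu f u_m\,dx$. The length, density, initial shape, and number of terms are hypothetical values chosen for illustration, and the integrals are done by a simple midpoint rule.

```python
import math

L, MU = 1.0, 2.0   # hypothetical length and uniform density

def u(n, x):
    """Normalized normal function of the uniform string: integral MU*u_n^2 dx = 1."""
    return math.sqrt(2.0 / (MU * L)) * math.sin(n * math.pi * x / L)

def f(x):
    """Hypothetical initial shape, vanishing at both ends."""
    return x * (L - x)

def coefficient(m, steps=2000):
    """A_m = integral_0^L MU * f(x) * u_m(x) dx, by the midpoint rule."""
    h = L / steps
    return sum(MU * f((i + 0.5) * h) * u(m, (i + 0.5) * h) * h for i in range(steps))

def partial_sum(x, terms=25):
    """Sum of the first few terms A_n u_n(x) of the expansion of f."""
    return sum(coefficient(n) * u(n, x) for n in range(1, terms + 1))

print(f(0.3), partial_sum(0.3))
```

The two printed numbers should agree closely, and the coefficients of even order come out negligibly small because the chosen shape is symmetric about the middle of the string.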

93. Perturbation Theory. — One approximate method of integrating
the differential equation of the nonuniform vibrating
string has already been indicated, making use of the resemblance
of the actual functions to sines and cosines. An entirely different
approximate method, the method of perturbations, is also
frequently useful. This is a method which applies if the problem
is very nearly a soluble one, the density and tension varying
only slightly from their values in the soluble case. The usual
application is to an almost uniform string. For simplicity we
consider only the case where the tension $T$ is a constant, while
the density is a function $\mu(x)$, almost equal to $\mu_0(x)$, for which
the problem can be solved. We assume that we know the characteristic
functions $u_n^0$ and frequencies $\omega_n^0$ for the soluble case,
satisfying, therefore, the differential equations

$$T\frac{d^2 u_n^0}{dx^2} + (\omega_n^0)^2\,\mu_0(x)\,u_n^0 = 0. \qquad (13)$$

We now remember that the functions $u_n^0$ form an orthogonal
set, and that any arbitrary function can be expanded in series of
such functions. Thus in particular the nth characteristic function
$u_n$ of the real problem can be so expanded:

$$u_n = \sum_k A_{nk}\,u_k^0. \qquad (14)$$

We may regard our problem as that of determining the constants
$A_{nk}$. Considered in function space, this problem is very simple.
The functions $u_k^0$ form one set of orthogonal unit vectors, the
$u_n$'s another, and these equations merely express one set in
terms of the other; they are the equations for a rotation of
coordinates in function space, from the axes characteristic
of the "unperturbed" problem with density $\mu_0$ to the "perturbed"
problem with density $\mu$.

The easiest way of getting at the conditions for rotation
is simply to substitute $u_n$ in the differential equation which we
wish it to satisfy,

$$T\frac{d^2 u_n}{dx^2} + \omega_n^2\,\mu\,u_n = 0.$$

If we do so, and use the differential equations which the $u_n^0$'s satisfy,
we have easily

$$\sum_k A_{nk}\left[(\omega_k^0)^2\,\mu_0 - \omega_n^2\,\mu\right]u_k^0 = 0.$$

Now we may multiply by an arbitrary $u_m^0$, and integrate from
0 to $L$. Remembering that the $u^0$'s are orthogonal, the result is

$$\sum_k A_{nk}\left[(\omega_k^0)^2\,\mu^0_{mk} - \omega_n^2\,\mu_{mk}\right] = 0, \qquad (15)$$

where $\mu^0_{mk} = \int_0^L \mu_0(x)\,u_m^0 u_k^0\,dx = 1$ if $m = k$, 0 if $m \neq k$, and
$\mu_{mk} = \int_0^L \mu(x)\,u_m^0 u_k^0\,dx$, a quantity differing from $\mu^0_{mk}$ only by
small quantities of the order of the deviation between $\mu$ and $\mu_0$.
We have here an infinite set of simultaneous homogeneous linear
equations ($m$ can take on any value) for the unknown constants
$A_{nk}$. These can be written, for a given $n$,

$$A_{n1}\left[(\omega_1^0)^2 - \omega_n^2\mu_{11}\right] + A_{n2}(-\omega_n^2\mu_{12}) + A_{n3}(-\omega_n^2\mu_{13}) + \cdots = 0$$
$$A_{n1}(-\omega_n^2\mu_{21}) + A_{n2}\left[(\omega_2^0)^2 - \omega_n^2\mu_{22}\right] + \cdots = 0$$
$$A_{n1}(-\omega_n^2\mu_{31}) + \cdots = 0$$
$$\cdots = 0.$$

In general these will have no solutions; the condition for existence
of a solution is that the determinant of coefficients vanish. This
forms an equation for $\omega_n^2$, called a secular or determinantal
equation, and just analogous to that which we found with the
problem of two coupled vibrations, when we made a rotation
of coordinates, and we recognize it as the general type met in
such problems. In this case, the equation has an infinite number
of roots, one near each unperturbed frequency.

It is hardly feasible to solve the determinantal equation
directly, though it is not hard to make an approximation to it.
It is easiest, however, to proceed directly from the linear equations.
If the $u^0$'s are nearly the same as the $u$'s, it is plain that we
shall have $A_{nk} = 1$ almost, if $n = k$, or $= 0$ almost, if $n \neq k$.
The only term in the equations which is large and need be
considered is then that for which $n = k$ (so that $A_{nk}$ will be
large) and simultaneously $m = k$ (so that $\mu^0_{mk}$ and $\mu_{mk}$ will
be large). This term gives

$$A_{nn}\left[(\omega_n^0)^2 - \omega_n^2\,\mu_{nn}\right] = 0, \quad \text{or} \quad \omega_n^2 = \frac{(\omega_n^0)^2}{\mu_{nn}}. \qquad (16)$$

If now $\mu = \mu_0 + \mu_1$, where $\mu_1$ is small compared with $\mu_0$, we have
$\mu_{nn} = 1 + \int_0^L \mu_1\,(u_n^0)^2\,dx$, so that, using the first term of a binomial
expansion,

$$\omega_n^2 = (\omega_n^0)^2\left(1 - \int_0^L \mu_1\,(u_n^0)^2\,dx\right), \qquad (17)$$

correct to the first order of small quantities, but neglecting terms
of the order of the square of the integral of $\mu_1$. It is not hard
to get expressions of the same order of accuracy for the $A$'s.
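
The first-order result, that the squared frequency is lowered by the fraction $\int_0^L \mu_1 (u_n^0)^2\,dx$, is easy to evaluate numerically when the unperturbed string is uniform, so that $(u_n^0)^2 = (2/\mu_0 L)\sin^2(n\pi x/L)$. The sketch below does this and compares with a case whose exact answer is known, a uniform fractional increase of density. The function name, the sample values, and the midpoint-rule integration are assumptions made for the example.

```python
import math

def first_order_omega_sq(n, T, mu0, L, mu1, steps=2000):
    """First-order omega_n^2 for uniform unperturbed density mu0 and length L.

    mu1 is a function of x giving the small added density.
    """
    omega0_sq = (n * math.pi / L) ** 2 * T / mu0
    h = L / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # (u_n^0)^2 for the normalized functions of the uniform string
        u0_sq = (2.0 / (mu0 * L)) * math.sin(n * math.pi * x / L) ** 2
        integral += mu1(x) * u0_sq * h
    return omega0_sq * (1.0 - integral)

# Uniform 1 percent increase in density: the exact result is omega0^2 / 1.01.
T, mu0, L, eps = 4.0, 1.0, 1.0, 0.01
approx = first_order_omega_sq(1, T, mu0, L, lambda x: eps * mu0)
exact = (math.pi / L) ** 2 * T / (mu0 * (1.0 + eps))
print(approx, exact)
```

The two values differ only in the second order of the small quantity, as the neglected terms would lead one to expect.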

94. Reflection of Waves from a Discontinuity. — We mentioned 
earlier that a progressive wave striking a discontinuity of density 
would be partly reflected, and only partly transmitted. It is 
easy to solve exactly the problem of propagation of the wave 
over the discontinuity, and as this is one of the exactly soluble 
cases of the vibration of the nonuniform string, and is the simplest 
problem of reflection, it is worth carrying its discussion through. 
Let us assume two uniform strings of different densities attached
to each other and subject to the same tension $T$. Let the first
string have a linear density $\mu_1$ and the second a density $\mu_2$.
We shall take the point of junction as $x = 0$. We thus have
different velocities of propagation $v_1 = \sqrt{T/\mu_1}$ and $v_2 = \sqrt{T/\mu_2}$
in the two strings. We may also define an "index of refraction"
of one medium with respect to the other as $n = v_1/v_2 = \sqrt{\mu_2/\mu_1}$.
At $x = 0$ we must satisfy certain conditions at every instant
of time. First, the displacement $u$ must be continuous across
the boundary if the strings remain joined together, and secondly,
the slope $\partial u/\partial x$ must also vary continuously across the boundary.
Were the latter condition not fulfilled, we would have the impossible
situation of a finite force acting on an infinitesimal piece
of the strings at the junction.

Let us consider a harmonic progressive wave in the first
string ($\mu_1$) impinging on the junction. In the second string we
shall have a wave traveling in the same direction as the impinging
wave, but in order to satisfy the boundary conditions, we must
assume a reflected wave in the first string. Thus

$$u_1 = A\,e^{2\pi i(\nu t - x/\lambda_1)} + B\,e^{2\pi i(\nu t + x/\lambda_1)}$$

and

$$u_2 = C\,e^{2\pi i(\nu t - x/\lambda_2)}.$$

The frequency is a fixed characteristic of the wave, independent
of the medium in which the wave is propagated. The wave
lengths $\lambda_1$ and $\lambda_2$ are related by the condition

$$\nu = \frac{v_1}{\lambda_1} = \frac{v_2}{\lambda_2}, \qquad \frac{\lambda_1}{\lambda_2} = n.$$

At the junction, where $x = 0$, we have

$$(u_1)_0 = A\,e^{2\pi i\nu t} + B\,e^{2\pi i\nu t},$$
$$(u_2)_0 = C\,e^{2\pi i\nu t},$$
$$\left(\frac{\partial u_1}{\partial x}\right)_0 = -\frac{2\pi i}{\lambda_1}A\,e^{2\pi i\nu t} + \frac{2\pi i}{\lambda_1}B\,e^{2\pi i\nu t},$$
$$\left(\frac{\partial u_2}{\partial x}\right)_0 = -\frac{2\pi i}{\lambda_2}C\,e^{2\pi i\nu t}.$$

Thus the conditions of continuity give

$$A + B = C,$$
$$\frac{A}{\lambda_1} - \frac{B}{\lambda_1} = \frac{C}{\lambda_2},$$
$$\frac{B}{A} = \frac{\lambda_2 - \lambda_1}{\lambda_1 + \lambda_2} = -\frac{n - 1}{n + 1},$$

giving the ratio of the amplitude of the reflected to the incident
wave. Two limiting cases are interesting: if $\mu_2 = \infty$, so that
the junction is held fast, we have $n = \infty$, $B = -A$, or the wave
is entirely reflected, with a change of phase. The other case is
$\mu_2 = 0$, the junction is free, and we have $n = 0$, $B = A$, reflection
again being complete, but with no change of phase. In
both these cases the incident and reflected waves combine to
give standing waves.
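
The two continuity conditions, $A + B = C$ and $(A - B)/\lambda_1 = C/\lambda_2$ with $\lambda_1/\lambda_2 = n$, can be tabulated for a few density ratios. The sketch below is only illustrative: the function name and sample densities are assumptions. Its last printed column, $(B/A)^2 + n\,(C/A)^2$, is a consistency check that is not made in the text; it should come out equal to 1 if the energy carried by the reflected and transmitted waves adds up to that of the incident wave.

```python
import math

def amplitude_ratios(mu1, mu2):
    """Return (B/A, C/A) for a wave passing from density mu1 to density mu2."""
    n = math.sqrt(mu2 / mu1)            # index of refraction v1/v2
    reflected = (1.0 - n) / (1.0 + n)   # B/A from the two continuity conditions
    transmitted = 1.0 + reflected       # C/A, from A + B = C
    return reflected, transmitted

# Equal densities, a heavier second string, and the two limiting cases
# (second string very heavy: fixed end; very light: free end).
for mu2 in (1.0, 4.0, 1e12, 1e-12):
    r, t = amplitude_ratios(1.0, mu2)
    n = math.sqrt(mu2)
    print(mu2, r, t, r**2 + n * t**2)
```

For a very heavy second string the reflected ratio approaches $-1$, and for a very light one it approaches $+1$, in agreement with the two limiting cases described above.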


1. A heavy uniform flexible chain hangs freely from one end. The chain
performs small lateral vibrations. Show that the normal functions are
$u_n = J_0\!\left(\dfrac{2\omega_n}{\sqrt{g}}\sqrt{x}\right)$, where $J_0$ represents the Bessel function of order zero;
$x$ is the distance from the bottom of the chain to any point, $g$ the acceleration
of gravity, and $\omega_n$ is the angular frequency of the nth mode of vibration.
For a chain 8 feet long, find the periods of the first few modes of vibration
(use Jahnke and Emde's tables to get the roots of the Bessel functions).

2. One end of a uniform flexible chain of length $l$ is attached to a vertical
rod which rotates at a constant angular velocity $\Omega_0$. Neglect the effect of
gravity, so that the chain stands out horizontally under the tension of
centrifugal force. Show that the differential equation for small vibrations
transverse to the length of the chain is

$$\frac{d}{dx}\left[(l^2 - x^2)\frac{du}{dx}\right] + \frac{2\omega^2}{\Omega_0^2}\,u = 0.$$

Introduce the variable $y = x/l$, and solve the resulting equation by the
power series method. The boundary conditions are $u(0) = 0$ and $u$ for
$y = 1$ must remain finite. Note that the latter condition can only be
fulfilled if the series breaks off to form a polynomial. Calculate the first
three polynomials and derive a relation for the frequency of the nth mode
of vibration. The polynomials so found are the Legendre polynomials of
odd order.

3. A string stretched with a uniform tension $T$, and with a density $a/x^2$,
is held at the points $x = x_1$ and $x = x_2$. Solve the equation, using the form
$u = \sqrt{x}\,z$, and show that the general solution is

$$u = A\,x^{\frac{1}{2} + ik} + B\,x^{\frac{1}{2} - ik},$$

where $k$ is defined by $k^2 + \frac{1}{4} = \omega^2 a/T$, and $\omega$ is the angular velocity.
Show from this that the general form of the normal function is

$$\sqrt{x}\,\sin\frac{n\pi\,\ln(x/x_1)}{\ln(x_2/x_1)}, \qquad n = 1, 2, 3, \ldots,$$

and that

$$\omega_n^2 = \frac{T}{a}\left[\frac{1}{4} + \frac{n^2\pi^2}{(\ln x_2/x_1)^2}\right].$$

4. Solve the differential equation of Prob. 3 by the approximate method 
described in this chapter, and show that the solution has the same form as 
the exact solution. Show that the two solutions coincide in the limit of 
large a. 


5. A progressive wave travels on a uniform string which at $x = 0$ is
connected to a string whose density is $\mu = \mu_0 + \alpha x$. This second string
is connected to a third at $x = l$ which has the constant density $\mu = \mu_0 + \alpha l$,
and the whole is stretched with a uniform tension $T$. Using the approximate
method, find the ratio of the amplitude of the wave transmitted in the third
string to the original amplitude of the incident wave in the first string.

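6. Consider a string of uniform density $\mu$, length $L$, but with a tension $T$
which varies slightly from its average tension $T_0$. Show with the help of a
perturbation calculation that the angular frequency of the nth mode is
given approximately by

$$\omega_n^2 = \frac{n^2\pi^2 T_0}{L^2\mu}\left(1 - \frac{1}{n\pi T_0}\int_0^L \frac{dT}{dx}\,\sin\frac{n\pi x}{L}\cos\frac{n\pi x}{L}\,dx\right).$$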

7. A uniform string of density $\mu_0$, tension $T$, has a small load $m$ placed
at $x = a$. Show that the frequency of the nth mode of vibration is approximately
given by

$$\omega_n^2 = \frac{n^2\pi^2 T}{L^2\mu_0}\left(1 - \frac{2m}{\mu_0 L}\sin^2\frac{n\pi a}{L}\right).$$

Show that the effect of the additional load vanishes if it is placed at a node,
and is biggest when at an antinode.

8. Show that the differential equation of Bessel's function $J_m$ is the same
as that for a string of tension $T = x$, $\mu\omega^2 = x - m^2/x$. Using the approximate
method developed for the vibration problem, show that approximately

$$J_m(x) = \frac{\text{constant}}{\sqrt[4]{x^2 - m^2}}\,\cos\left(\int\sqrt{1 - m^2/x^2}\,dx - \alpha\right),$$

where $x > m$.

9. Using the approximation of Prob. 8 for $J_0$ and $J_1$, compute the approximation
functions for a number of values of $x$, and show by a table of values
how well these agree with the correct functions. Choose the arbitrary
amplitude and phase factors to make the functions agree with the values of
$J_0$ and $J_1$ in the tables, for example making the zeros agree by adjusting $\alpha$,
and the maxima by adjusting the amplitude, taking such values as to get
the best agreement possible for large $x$'s.

10. Derive the differential Eq. (4) for $A$, in the approximate solution

$$u = A\,e^{i\omega\int\sqrt{\mu/T}\,dx}.$$



The problem of a vibrating membrane is very little more
difficult in principle than the string. Let us take two coordinates,
$x$ and $y$, in the plane of the membrane, writing $u$ for the displacement
at right angles to the plane, so that we wish a relation
$u = u(x, y, t)$. Consider a small element of the membrane,
bounded by $dx$ and $dy$. Let the mass per unit area be $\mu$, so
that the mass of the element is $\mu\,dx\,dy$. Then its mass, times
acceleration normal to the membrane, is $\mu\,dx\,dy\,\partial^2 u/\partial t^2$. This
is equal to the force arising from the tension. Let the tension
be $T$. That is, if we cut the membrane along any line, the
material on one side of the cut exerts a force on the material
on the other, normal to the cut and equal to $T$ for each unit
of length of the cut. We assume that $T$ is constant over the membrane.
If the membrane were plane, the tension on its opposite
edges would cancel, and we should have no resultant force.
If it is curved, however, we may proceed as follows. Along
the edge at $x + dx$, the tension is at right angles to the $y$ axis,
almost along the $x$ axis, but with a small component along the
$u$ direction, equal approximately to $T\left(\dfrac{\partial u}{\partial x}\right)_{x+dx}$ per unit of length,
or this times $dy$ for the actual length $dy$. Similarly along the
edge at $x$ the component is $-T\left(\dfrac{\partial u}{\partial x}\right)_{x} dy$, so that the sum is
approximately $T\,(\partial^2 u/\partial x^2)\,dx\,dy$. The forces acting along the
edges at $y$ and $y + dy$ similarly add to $T\,(\partial^2 u/\partial y^2)\,dx\,dy$, and the
total force, the sum of these, is $T\,(\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2)\,dx\,dy$.
Thus the differential equation, dividing by $dx\,dy$, is

$$\mu\frac{\partial^2 u}{\partial t^2} = T\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right). \qquad (1)$$

95. Boundary Conditions on the Rectangular Membrane. — 

A membrane is ordinarily held fast around a certain curve. 
In this way one can get a great variety of problems, by taking 
different curves. The two simplest are the rectangular membrane,
and the circular membrane, or ordinary drum, and in
the present section we consider the rectangular case, assuming
the membrane to be held at $x = 0$, $x = X$, $y = 0$, $y = Y$.
We solve first by the exponential method, assuming

$$u = e^{i(\omega t + kx + ly)}.$$

Then the differential equation becomes $-\mu\omega^2 = -T(k^2 + l^2)$,
$\omega = \sqrt{T(k^2 + l^2)/\mu}$, giving the angular velocity of the vibration
in terms of the quantities $k$ and $l$. Instead of the exponential
solution we can equally well use sines or cosines. For example,
with a given $\omega$, $k$, and $l$, we can take

$$u = e^{i\omega t}\left(e^{ikx+ily} - e^{-ikx+ily} - e^{ikx-ily} + e^{-ikx-ily}\right)$$
$$= e^{i\omega t}\,(2i\sin kx)\,(e^{ily} - e^{-ily})$$
$$= -4\,e^{i\omega t}\sin kx\,\sin ly.$$

As a matter of fact, this solution with sines is the one we want,
since it reduces to zero when $x = 0$ and $y = 0$. To apply the
condition when $x = X$ and $y = Y$, we must make the sines zero
at these points, or must have $\sin kX = 0$, $\sin lY = 0$, or
$k = n\pi/X$, $l = m\pi/Y$, where $n$, $m$ are integers. In terms of these
constants, we can then write

$$\omega_{nm} = \pi\sqrt{\frac{T}{\mu}\left(\frac{n^2}{X^2} + \frac{m^2}{Y^2}\right)}, \qquad (2)$$
so that instead of having overtones whose frequencies are integral 
multiples of a fundamental, the frequencies are given by a much 
more complicated relation. There is one interesting result of 
this. Pleasing musical notes depend on having the frequencies 
of the overtones related in simple ways to the fundamental, so 
that they sound well together, as with a vibrating string. In a 
membrane or drum, in which these relations do not hold, the 
sound is far less musical than with a string. This suggests other 
cases, which do not exactly fall within the category of the present 
chapter. For example, a vibrating bell acts as a two-dimensional 
vibrating system, a little like a membrane, and has complicated 
overtones which in general are not harmonics. But it has been 
found by trial that if bells are made in their conventional shape, 
overtones are so adjusted that the loud ones are actually in tune 
with each other, though a slight change of shape would destroy 
the quality. 
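
The lack of harmonic relations among the overtones can be seen by tabulating a few frequencies. The sketch below assumes the relation $\omega_{nm} = \pi\sqrt{(T/\mu)(n^2/X^2 + m^2/Y^2)}$, which follows from $k = n\pi/X$, $l = m\pi/Y$ and $\omega = \sqrt{T(k^2 + l^2)/\mu}$, and lists the ratios of the lowest overtones to the fundamental for a square membrane. The function name and the unit values of the constants are assumptions chosen for illustration.

```python
import math

def membrane_omega(n, m, X, Y, T, mu):
    """omega_nm = pi * sqrt((T/mu) * (n^2/X^2 + m^2/Y^2)) for the rectangular membrane."""
    return math.pi * math.sqrt((T / mu) * (n**2 / X**2 + m**2 / Y**2))

# Hypothetical square membrane with unit side, tension, and density.
X = Y = 1.0
T = mu = 1.0
fundamental = membrane_omega(1, 1, X, Y, T, mu)
ratios = sorted({round(membrane_omega(n, m, X, Y, T, mu) / fundamental, 4)
                 for n in range(1, 4) for m in range(1, 4)})
print(ratios)
```

Only the modes with $n = m$ give integral multiples of the fundamental here; the others fall at irrational ratios such as $\sqrt{5/2}$, which is the behavior the paragraph above describes.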


96. The Nodes in a Vibrating Membrane. — If the membrane 
is vibrating with one overtone, the amplitude will be zero along 
certain lines, which will stay at rest. These nodal lines form a 
rectangular network, coming when $nx/X = 1, 2, \ldots, n - 1$,
and for $my/Y = 1, 2, \ldots, m - 1$. At any instant, if the
membrane is displaced upward in one rectangle, it will be dis- 
placed downward in all adjacent rectangles. Such a nodal 
arrangement is characteristic of all sorts of standing wave 

97. Initial Conditions. — At t = 0, we may wish to fix the shape 
and velocity of our membrane, obtaining initial conditions of 
the sort found with the string, and leading as before to Fourier 
series. For example, suppose the initial velocity is zero, the 
initial displacement a function $f(x, y)$. Then we must have

$$u = \sum_{n,m} A_{nm}\cos\omega_{nm}t\,\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{Y}, \qquad (3)$$

where $\omega_{nm}$ is given in Eq. (2), and where we have

$$f(x, y) = \sum_{n,m} A_{nm}\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{Y}. \qquad (4)$$

To find the coefficients $A$, the amplitudes of the various overtones
necessary to satisfy the conditions, we must expand the function
$f(x, y)$ in a series of products of sines, a double Fourier series, as
it is called. As in the last chapter, we assume that the expansion
can be carried through, and ask only for the values of the coefficients.
Multiplying both sides of the equation by
$\sin\dfrac{n'\pi x}{X}\,\sin\dfrac{m'\pi y}{Y}$, where $n'$, $m'$ are definite integers, we integrate with respect
to $x$ from 0 to $X$, and with respect to $y$ from 0 to $Y$. We find as
before that $\int_0^X \sin\dfrac{n\pi x}{X}\,\sin\dfrac{n'\pi x}{X}\,dx$ is zero unless $n = n'$, and is
$X/2$ if $n = n'$. Thus the final result is

$$A_{n'm'} = \frac{4}{XY}\int_0^X\!\!\int_0^Y f(x, y)\,\sin\frac{n'\pi x}{X}\,\sin\frac{m'\pi y}{Y}\,dx\,dy. \qquad (5)$$

It is worth noting that this is the first time we have had to use a
double integral. If $f(x, y)$ is a complicated function of the
coordinates, it can, of course, be a very difficult problem actually
to evaluate the integral.
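
When the double integral cannot be done in closed form it can be evaluated numerically. The sketch below applies a midpoint rule to $A_{nm} = \dfrac{4}{XY}\int_0^X\!\int_0^Y f \sin\dfrac{n\pi x}{X}\sin\dfrac{m\pi y}{Y}\,dx\,dy$, using as a test an initial shape that is itself a single overtone, so that the answer is known in advance. The function name, the grid size, and the sample shape are assumptions made for the example.

```python
import math

def coefficient(f, n, m, X, Y, steps=200):
    """Midpoint-rule estimate of A_nm for an initial displacement f(x, y)."""
    hx, hy = X / steps, Y / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        sx = math.sin(n * math.pi * x / X)
        for j in range(steps):
            y = (j + 0.5) * hy
            total += f(x, y) * sx * math.sin(m * math.pi * y / Y)
    return 4.0 / (X * Y) * total * hx * hy

# Hypothetical initial shape: the (n, m) = (1, 2) overtone with amplitude 0.5.
X, Y = 2.0, 1.0

def shape(x, y):
    return 0.5 * math.sin(math.pi * x / X) * math.sin(2.0 * math.pi * y / Y)

print(coefficient(shape, 1, 2, X, Y), coefficient(shape, 2, 1, X, Y))
```

The first coefficient should recover the amplitude put in, and the second should be negligibly small, since the shape contains no (2, 1) overtone.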


98. The Method of Separation of Variables.— To solve our 
differential equation, we may adopt a slightly different method, 
called the method of separation of variables, which does not 
directly depend on the use of exponentials. It is a method for
reducing the partial differential equation to a set of ordinary
equations, and we shall find it very useful. In fact, it is so valuable
that practically the only partial differential equations which
can be solved at all are those for which this method can be used.

We wish to solve $\dfrac{\partial^2 u}{\partial t^2} = \dfrac{T}{\mu}\left(\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2}\right)$. Suppose we try to
find a solution $u$ which is the product of a function of $x$, a function
of $y$, and a function of $t$; say $u = P(x)Q(y)R(t)$, where $P$ is a
function of $x$ to be determined, and so on. Of course, it is not
obvious that one can find such a solution, but our experience
would lead us to try it. If we substitute, we have, for example,
$\partial u/\partial t = PQ\,dR/dt$, and so on. If we denote $dR/dt$ by $R'$, with
corresponding notation, we then have $PQR'' = (T/\mu)(P''QR + PQ''R)$.
Next we divide by $PQR$, obtaining

$$\frac{R''}{R} = \frac{T}{\mu}\left(\frac{P''}{P} + \frac{Q''}{Q}\right). \qquad (6)$$

We now make the step characteristic of the method of separation
of variables: we observe that the function $R''/R$ on the left
of Eq. (6) is a function of $t$ alone, the quantity on the right a
function of $x$ and $y$ alone. The equation then states that a
certain function of $t$ equals a function of $x$ and $y$, whatever $x$, $y$,
and $t$ may be. But this is clearly impossible in general. If, for
example, we keep $x$ and $y$ constant, and vary $t$, the left side would
change, the right remaining constant, and the equation would
not be satisfied. The only exception, as this example shows, is if
the left side is a constant, independent of $t$, and similarly if the
right side is a constant, independent of $x$ and $y$. Let us then
impose these conditions, letting the constant be $-\omega^2$ (an arbitrary
constant so far, but later to be identified with our other $\omega$).
We have then two equations,

$$\frac{R''}{R} = -\omega^2, \qquad R'' + \omega^2 R = 0,$$

$$\frac{P''}{P} + \frac{Q''}{Q} = -\omega^2\,\frac{\mu}{T}. \qquad (7)$$

Taking the latter equation, we may again separate. We write it 

T ~ -Q *T (8) 

The left side is a function of x, the right side of y, and by the same 
argument each is a constant, say $-k^2$. Then we have $P'' + k^2 P = 0$,
and $-k^2 = -(Q''/Q) - \omega^2(\mu/T)$. We can rewrite
this last as $Q''/Q = -l^2$, where

$$-l^2 = k^2 - \omega^2\frac{\mu}{T}, \quad\text{or}\quad \omega^2 = \frac{T}{\mu}\,(k^2 + l^2), \qquad (9)$$

and it becomes $Q'' + l^2 Q = 0$. We now have three ordinary
differential equations for P, Q, and R, whose solutions are

$$P = e^{ikx}\ (\text{or } e^{-ikx},\ \text{or } \sin kx,\ \text{or } \cos kx), \qquad Q = e^{ily}, \qquad R = e^{i\omega t},$$

so that the final solution is as we found before, with the same
relation between $\omega$, k, and l.
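The separated solution can be checked numerically. The following is a minimal sketch, not part of the original text: it takes the real form sin kx sin ly cos ωt of the product solution, fixes ω from Eq. (9), and evaluates μ ∂²u/∂t² − T(∂²u/∂x² + ∂²u/∂y²) by central differences. The function names and the sample values of k, l, T, and μ are assumptions chosen only for illustration.

```python
import math

def mode(x, y, t, k, l, omega):
    """Real form of the product solution P(x) Q(y) R(t)."""
    return math.sin(k * x) * math.sin(l * y) * math.cos(omega * t)

def second_difference(f, s, h):
    """Central-difference estimate of the second derivative of f at s."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

def wave_residual(x, y, t, k, l, tension, mu, h=1e-4):
    """mu*u_tt - T*(u_xx + u_yy) for the product solution, with omega from Eq. (9)."""
    omega = math.sqrt(tension / mu * (k * k + l * l))
    u_xx = second_difference(lambda s: mode(s, y, t, k, l, omega), x, h)
    u_yy = second_difference(lambda s: mode(x, s, t, k, l, omega), y, h)
    u_tt = second_difference(lambda s: mode(x, y, s, k, l, omega), t, h)
    return mu * u_tt - tension * (u_xx + u_yy)

# Hypothetical sample point and constants, in arbitrary consistent units.
residual = wave_residual(x=0.3, y=0.7, t=0.2, k=2.0, l=3.0, tension=5.0, mu=0.5)
print(abs(residual) < 1e-4)
```

The residual is not exactly zero only because of the finite-difference step; choosing any ω other than the one given by Eq. (9) leaves a residual of the order of the terms themselves.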

99. The Circular Membrane. — The differential equation for 
the circular membrane is the same as for the rectangular one, but 
the boundary condition is different: the displacement u is always
zero on a circle of radius p about the origin. To solve the problem,
the simplest method is to introduce polar coordinates, r,
$\theta$; for then the boundary condition is that u = 0 when r = p,
which is a condition easy to apply. Let us then write our equation
in polar coordinates. Before doing it, we shall give the
conventional names of the equations and symbols we meet. Our 
equation, which is often written 

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} - \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2} = 0, \qquad (10)$$

where $v = \sqrt{T/\mu}$ is the velocity of the wave, is called the wave
equation, for u represents waves, either progressive or standing.
The special case $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = 0$, where u is independent
of t, is called Laplace's equation. And the expression
$\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2$, which we have already seen can be written in vector
notation $\nabla^2 u$, is called the Laplacian of u. Our present problem
is to find the Laplacian in polar coordinates.

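100. The Laplacian in Polar Coordinates. — Let us introduce
r and $\theta$ by the equations $x = r\cos\theta$, $y = r\sin\theta$, $r = \sqrt{x^2 + y^2}$,
$\theta = \tan^{-1}(y/x)$, so that

$$\frac{\partial r}{\partial x} = \frac{x}{r}, \qquad \frac{\partial r}{\partial y} = \frac{y}{r}, \qquad \frac{\partial\theta}{\partial x} = -\frac{y}{r^2}, \qquad \frac{\partial\theta}{\partial y} = \frac{x}{r^2},$$

and

$$\frac{\partial^2 r}{\partial x^2} = \frac{1}{r} - \frac{x^2}{r^3}, \qquad \frac{\partial^2 r}{\partial y^2} = \frac{1}{r} - \frac{y^2}{r^3}, \qquad \frac{\partial^2\theta}{\partial x^2} = \frac{2xy}{r^4}, \qquad \frac{\partial^2\theta}{\partial y^2} = -\frac{2xy}{r^4}.$$

Then we have

$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial u}{\partial\theta}\frac{\partial\theta}{\partial x}.$$

If we apply this process again, we find without difficulty

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial r^2}\left(\frac{\partial r}{\partial x}\right)^2 + 2\frac{\partial^2 u}{\partial r\,\partial\theta}\left(\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x}\right) + \frac{\partial^2 u}{\partial\theta^2}\left(\frac{\partial\theta}{\partial x}\right)^2 + \frac{\partial u}{\partial r}\frac{\partial^2 r}{\partial x^2} + \frac{\partial u}{\partial\theta}\frac{\partial^2\theta}{\partial x^2}.$$

Proceeding similarly with y, and adding, we have

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial r^2}\left[\left(\frac{\partial r}{\partial x}\right)^2 + \left(\frac{\partial r}{\partial y}\right)^2\right] + 2\frac{\partial^2 u}{\partial r\,\partial\theta}\left(\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x} + \frac{\partial r}{\partial y}\frac{\partial\theta}{\partial y}\right)$$

$$\qquad + \frac{\partial^2 u}{\partial\theta^2}\left[\left(\frac{\partial\theta}{\partial x}\right)^2 + \left(\frac{\partial\theta}{\partial y}\right)^2\right] + \frac{\partial u}{\partial r}\left(\frac{\partial^2 r}{\partial x^2} + \frac{\partial^2 r}{\partial y^2}\right) + \frac{\partial u}{\partial\theta}\left(\frac{\partial^2\theta}{\partial x^2} + \frac{\partial^2\theta}{\partial y^2}\right).$$

Substituting, this becomes

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{1}{r}\frac{\partial u}{\partial r},$$

which can also be written

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2}. \qquad (11)$$

This is the expression for the Laplacian in polar coordinates.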
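As a check on the algebra, the polar expression can be compared numerically with the Cartesian Laplacian. The sketch below is an illustration only: the test function x³y + sin x eʸ is an arbitrary choice (its Laplacian is exactly 6xy), and all function names are hypothetical.

```python
import math

def u_cartesian(x, y):
    """Arbitrary smooth test function; its exact Laplacian is 6*x*y."""
    return x ** 3 * y + math.sin(x) * math.exp(y)

def u_polar(r, theta):
    """The same function expressed through r and theta."""
    return u_cartesian(r * math.cos(theta), r * math.sin(theta))

def d1(f, s, h):
    """Central-difference first derivative."""
    return (f(s + h) - f(s - h)) / (2.0 * h)

def d2(f, s, h):
    """Central-difference second derivative."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

def laplacian_cartesian(x, y, h=1e-4):
    return d2(lambda s: u_cartesian(s, y), x, h) + d2(lambda s: u_cartesian(x, s), y, h)

def laplacian_polar(r, theta, h=1e-4):
    """u_rr + (1/r) u_r + (1/r^2) u_theta_theta, the expanded polar form."""
    u_rr = d2(lambda s: u_polar(s, theta), r, h)
    u_r = d1(lambda s: u_polar(s, theta), r, h)
    u_tt = d2(lambda s: u_polar(r, s), theta, h)
    return u_rr + u_r / r + u_tt / (r * r)

r, theta = 1.3, 0.6
x, y = r * math.cos(theta), r * math.sin(theta)
print(laplacian_cartesian(x, y), laplacian_polar(r, theta), 6.0 * x * y)
```

The three printed numbers should agree to within the accuracy of the finite differences.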
101. Solution of the Differential Equation by Separation. — 
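Our differential equation is now

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}. \qquad (12)$$

Let us solve by separation of variables, assuming
$u = R(r)\,\Theta(\theta)\,T(t)$. Then, substituting, and dividing by $R\Theta T$, the
result is

$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2\Theta}\frac{d^2\Theta}{d\theta^2} = \frac{1}{v^2}\frac{1}{T}\frac{d^2 T}{dt^2}. \qquad (13)$$

The problem is separated: the left side depends only on r and $\theta$, the
right on t. Each must then be a constant, which we shall call
$-\omega^2/v^2$, giving $d^2T/dt^2 + \omega^2 T = 0$, $T = A\cos\omega t + B\sin\omega t$,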



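and

$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2\Theta}\frac{d^2\Theta}{d\theta^2} = -\frac{\omega^2}{v^2}.$$

We multiply by $r^2$, and transfer the first term to the right, obtaining

$$\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = -\left[\frac{r}{R}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{\omega^2 r^2}{v^2}\right].$$

Again the variables are separated, the left side depending only on
$\theta$, the right on r. Let each equal $-m^2$. Then $d^2\Theta/d\theta^2 + m^2\Theta = 0$,
$\Theta = C\cos m\theta + D\sin m\theta$, and the equation for r can be
immediately changed to

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \left(\frac{\omega^2}{v^2} - \frac{m^2}{r^2}\right)R = 0. \qquad (14)$$

This is just like Bessel's equation (see Prob. 13, Chap. II), except
that it has the constant $\omega^2/v^2$ in place of 1. A simple change
of variables removes this discrepancy, however. Let $x = \omega r/v$.
Then the equation becomes

$$\frac{\omega^2}{v^2}\,\frac{1}{x}\frac{d}{dx}\left(x\frac{dR}{dx}\right) + \frac{\omega^2}{v^2}\left(1 - \frac{m^2}{x^2}\right)R = 0,$$

or cancelling $\omega^2/v^2$, it is exactly Bessel's equation

$$\frac{1}{x}\frac{d}{dx}\left(x\frac{dR}{dx}\right) + \left(1 - \frac{m^2}{x^2}\right)R = 0. \qquad (15)$$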

The solution is then R = constant × $J_m(x)$, a Bessel's function
of the mth order, whose expansion in power series we have already
considered, for integral values of m, and for which we have found
an approximation in the preceding chapter (Chap. XIV, Prob. 8).
We shall see in the next section that only integral m's must be
used in the present problem.
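The claim that the power series for J_m satisfies Bessel's equation can be tested directly. The following sketch assumes the standard series J_m(x) = Σ (−1)ˢ (x/2)^(2s+m) / [s! (s+m)!] for integer m ≥ 0 and evaluates the left side of Eq. (15), written as R'' + R'/x + (1 − m²/x²)R, by finite differences. The function names are hypothetical.

```python
import math

def bessel_j(m, x, terms=60):
    """J_m(x) for integer m >= 0 from its power series (adequate for moderate x)."""
    term = (x / 2.0) ** m / math.factorial(m)   # s = 0 term of the series
    total = 0.0
    for s in range(terms):
        total += term
        # Ratio of successive series terms: -(x/2)^2 / ((s+1)(s+1+m)).
        term *= -(x / 2.0) ** 2 / ((s + 1) * (s + 1 + m))
    return total

def bessel_residual(m, x, h=1e-4):
    """Left side of Bessel's equation, R'' + R'/x + (1 - m^2/x^2) R, for R = J_m."""
    j = lambda s: bessel_j(m, s)
    first = (j(x + h) - j(x - h)) / (2.0 * h)
    second = (j(x + h) - 2.0 * j(x) + j(x - h)) / (h * h)
    return second + first / x + (1.0 - m * m / (x * x)) * j(x)

for order in (0, 1, 2):
    print(order, bessel_j(order, 3.0), bessel_residual(order, 3.0))
```

The residuals should be small compared with the function values, limited only by the finite-difference step.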

102. Boundary Conditions. — Consider in the first place the
solution for $\Theta$. At a given point of the membrane, the value of $\theta$
is determined, but not in a single-valued way. Thus if the point
corresponds to $\theta$ = 47 deg., it would equally well correspond to
47 deg. + 360 deg., or 47 deg. + 720 deg., etc. Now $\Theta$ must
surely have a definite value at each point of the membrane. Thus
it must have the same value for $\theta$, $\theta + 2\pi$, $\theta + 4\pi$, etc. In
other words, $\Theta$ is periodic in $\theta$ with period $2\pi$. But this is true



if, and only if, m is an integer. Hence our first condition, neces- 
sary to make the function single valued, is that m be an integer. 
Next consider the solution for r: $R = J_m(\omega r/v)$, where now
m is an integer. At the edge of the membrane, u = 0, which
means that R = 0, or $J_m(\omega p/v) = 0$. Now $J_m(x)$ is zero only
for certain definite values of x, say $x = x_1, x_2, x_3, \ldots$. From
the properties of Bessel's functions, we have seen that there
are an infinite number of such roots. Thus, to satisfy our
boundary conditions, we must let $\omega p/v = x_1, x_2, \ldots$. The
only adjustable quantity is $\omega$, so that it must be determined
by one or another of the values $\omega = v x_1/p,\ v x_2/p,\ \ldots$. Suppose
in particular that $\omega = v x_k/p$, determined by the kth root
of $J_m$. Then we should properly label it $\omega_{mk}$, since it depends
on both these indices. We have thus determined our solution
completely, except for the remaining arbitrary constants. These
can be easily expressed in the following form:

$$u = (A\cos\omega_{mk}t + B\sin\omega_{mk}t)\,\cos(m\theta - \alpha_{mk})\,J_m(\omega_{mk}r/v).$$

This is a particular solution. The general solution is the sum
of such terms, taken over all m's and k's.

Fig. 22. — Nodes of circular membrane. Shaded segments are displaced in
opposite phase to unshaded. Panels: m = 0, k = 0; m = 2, k = 0; m = 1, k = 1; m = 1, k = 2.
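The roots x_k, and with them the allowed frequencies, are easily found numerically. The sketch below is only an illustration under stated assumptions: it evaluates J_m from its power series, scans for sign changes, and refines each root by bisection. The function names, the wave velocity of 100 and the radius of 0.15 (arbitrary consistent units) are hypothetical.

```python
import math

def bessel_j(m, x, terms=60):
    """J_m(x) for integer m >= 0 from its power series (adequate for x up to about 15)."""
    term = (x / 2.0) ** m / math.factorial(m)
    total = 0.0
    for s in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((s + 1) * (s + 1 + m))
    return total

def bessel_zeros(m, count, step=0.1):
    """First `count` positive zeros of J_m: scan for sign changes, then bisect."""
    zeros = []
    a, fa = step, bessel_j(m, step)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(m, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(m, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

def mode_frequencies(m, count, v, radius):
    """Angular frequencies omega = v * x_k / radius for the first roots x_k of J_m."""
    return [v * x_k / radius for x_k in bessel_zeros(m, count)]

for m in (0, 1, 2):
    print(m, [round(w, 1) for w in mode_frequencies(m, 3, v=100.0, radius=0.15)])
```

Dividing each frequency by the lowest one (m = 0, first root) gives ratios that are not whole numbers, which is one way to see that the overtones are not harmonic.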

103. Physical Nature of the Solution. — A single term corre- 
sponds to a single standing wave. Its nodes are concentric 
circles, values of r for which $J_m(\omega_{mk}r/v)$ is zero, of which of
course the boundary is one; and radii, determined by
$\cos(m\theta - \alpha_{mk}) = 0$, as in Fig. 22. It is readily seen that there are m radial


nodes, k circular nodes without counting the boundary. The
arbitrary constant $\alpha_{mk}$ determines the angles at which the radial
nodes are; changing it simply rotates the whole nodal pattern.
The constants A and B determine the amplitude and phase
of the disturbance as a function of time. We may, if we choose,
consider that there are two separate waves possible for each
frequency, $\cos m\theta\,J_m$ and $\sin m\theta\,J_m$. Such a case is called
degenerate; we shall see in a problem that the same thing is true of the
square membrane. In a degenerate case, with two or more 
possible vibrations of the same frequency, it is plain that any 
linear combination of these vibrations gives a possible vibration 
of this same frequency. As with the rectangular membrane,
the set of frequencies $\omega_{mk}$ does not form a simple set of overtones
with pitches in harmonic relation to each other.

104. Initial Condition at t = 0. — Suppose we know that at
t = 0, the displacement of the membrane is given by $F(r, \theta)$,
and the velocity by $G(r, \theta)$. Now we can write the whole solution,
in a slightly more general way than before,

$$u = \sum_{m,k}\Big[(A_{mk}\cos\omega_{mk}t + B_{mk}\sin\omega_{mk}t)\cos m\theta + (C_{mk}\cos\omega_{mk}t + D_{mk}\sin\omega_{mk}t)\sin m\theta\Big]\,J_m\!\left(\frac{\omega_{mk}r}{v}\right). \qquad (16)$$

Thus, writing displacement and velocity at t = 0, we have

$$F(r,\theta) = \sum_{m,k}(A_{mk}\cos m\theta + C_{mk}\sin m\theta)\,J_m\!\left(\frac{\omega_{mk}r}{v}\right),$$

$$G(r,\theta) = \sum_{m,k}\omega_{mk}\,(B_{mk}\cos m\theta + D_{mk}\sin m\theta)\,J_m\!\left(\frac{\omega_{mk}r}{v}\right).$$

The A's, B's, C's, D's must be chosen to fit these conditions.
Both conditions are of the same sort. They require us to find
the coefficients for expanding a function of r and $\theta$ in series of
products of sines and cosines and Bessel's functions. Now it
proves to be true that both the sines or cosines and the Bessel's
functions are orthogonal, and as a result of this we can make the
expansions we desire in the usual way, as with Fourier series.
Let us take the first equation, multiply by $\cos n\theta\,J_n(\omega_{nl}r/v)$, and
integrate over the area of the drum. That is, we integrate with
respect to r from 0 to p, and with respect to $\theta$ from 0 to $2\pi$,
and the element of area is $r\,dr\,d\theta$. Then we have

$$\int_0^{2\pi}\!\!\int_0^{p} F(r,\theta)\cos n\theta\,J_n\!\left(\frac{\omega_{nl}r}{v}\right) r\,dr\,d\theta = \sum_{m,k}\int_0^{2\pi}(A_{mk}\cos m\theta + C_{mk}\sin m\theta)\cos n\theta\,d\theta \int_0^{p} r\,J_m\!\left(\frac{\omega_{mk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr.$$

By the orthogonal property of the sine and cosine, the right side
is zero unless m = n, giving $\sum_k \pi A_{nk}\int_0^p r\,J_n(\omega_{nk}r/v)\,J_n(\omega_{nl}r/v)\,dr$
(for n = 0 the factor $\pi$ is replaced by $2\pi$).

But now we shall prove in the next section that the J's are
orthogonal in the sense that $\int_0^p r\,J_n(\omega_{nk}r/v)\,J_n(\omega_{nl}r/v)\,dr = 0$, if $k \neq l$.
Using this fact, our sum reduces to the single term

$$\pi A_{nl}\int_0^p r\,J_n^2\!\left(\frac{\omega_{nl}r}{v}\right)dr.$$

If the last integral, which could be easily computed if we knew
the properties of Bessel's functions better, were denoted by $c_{nl}$,
then we should have

$$A_{nl} = \frac{1}{\pi c_{nl}}\int_0^{2\pi}\!\!\int_0^{p} F(r,\theta)\cos n\theta\,J_n\!\left(\frac{\omega_{nl}r}{v}\right) r\,dr\,d\theta,$$



determining the coefficients A in terms of a single integral. 
Similarly we could get formulas for the B's, C's, D's. Of course, 
in an actual case, these integrals might be very difficult to com- 
pute, but nevertheless we have a general solution of our problem. 
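To make the procedure concrete, here is a minimal numerical sketch for the simplest case, an initial displacement with no angular dependence, so that only m = 0 terms appear and the angular integrals cancel, leaving A_0k = ∫ F J_0 r dr / ∫ r J_0² dr. The initial shape F(r) = 1 − r² on a membrane of unit radius, the use of five terms, and all function names are assumptions made for illustration; the zeros of J_0 are standard tabulated values.

```python
import math

# First five positive zeros of J_0 (standard tabulated values).
J0_ZEROS = [2.4048255577, 5.5200781103, 8.6537279129, 11.7915344391, 14.9309177086]

def bessel_j0(x, terms=60):
    """J_0(x) from its power series."""
    total, term = 0.0, 1.0
    for s in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((s + 1) ** 2)
    return total

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def radial_coefficients(shape, radius=1.0):
    """Coefficients A_0k of an initial displacement F(r) independent of angle."""
    coeffs = []
    for x_k in J0_ZEROS:
        mode = lambda r, x_k=x_k: bessel_j0(x_k * r / radius)
        numerator = simpson(lambda r: shape(r) * mode(r) * r, 0.0, radius)
        c_k = simpson(lambda r: r * mode(r) ** 2, 0.0, radius)  # normalizing integral
        coeffs.append(numerator / c_k)
    return coeffs

def partial_sum(coeffs, r, radius=1.0):
    """Truncated Bessel series evaluated at radius r."""
    return sum(a * bessel_j0(x_k * r / radius) for a, x_k in zip(coeffs, J0_ZEROS))

coeffs = radial_coefficients(lambda r: 1.0 - r * r)
print([round(a, 4) for a in coeffs])
print(partial_sum(coeffs, 0.0), partial_sum(coeffs, 0.5))
```

The two printed partial sums can be compared with the exact values F(0) = 1 and F(0.5) = 0.75; with only five terms the agreement is approximate.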
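105. Proof of Orthogonality of the J's. — We can prove the
orthogonality of the J's directly from the differential equation,
as was done in the last chapter for the nonuniform vibrating
string. We wish to prove that

$$\int_0^{p} r\,J_n\!\left(\frac{\omega_{nk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr = 0, \qquad k \neq l.$$

Now we have

$$\frac{1}{r}\frac{d}{dr}\left[r\,\frac{dJ_n(\omega_{nl}r/v)}{dr}\right] + \left(\frac{\omega_{nl}^2}{v^2} - \frac{n^2}{r^2}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right) = 0,$$

$$\frac{1}{r}\frac{d}{dr}\left[r\,\frac{dJ_n(\omega_{nk}r/v)}{dr}\right] + \left(\frac{\omega_{nk}^2}{v^2} - \frac{n^2}{r^2}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right) = 0.$$

Multiply the first by $r\,J_n(\omega_{nk}r/v)$, the second by $r\,J_n(\omega_{nl}r/v)$,
subtract, and integrate from 0 to p. The result is

$$\int_0^{p}\left\{J_n\!\left(\frac{\omega_{nk}r}{v}\right)\frac{d}{dr}\left[r\,\frac{dJ_n(\omega_{nl}r/v)}{dr}\right] - J_n\!\left(\frac{\omega_{nl}r}{v}\right)\frac{d}{dr}\left[r\,\frac{dJ_n(\omega_{nk}r/v)}{dr}\right]\right\}dr = \frac{\omega_{nk}^2 - \omega_{nl}^2}{v^2}\int_0^{p} r\,J_n\!\left(\frac{\omega_{nl}r}{v}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right)dr.$$

Just as in the last chapter, the left side can be shown to be zero,
by integrating by parts. Then the right side must be zero,
and either $\omega_{nk}^2 - \omega_{nl}^2$ is zero, which is not true unless k and l
refer to the same overtone, or $\int_0^p r\,J_n(\omega_{nl}r/v)\,J_n(\omega_{nk}r/v)\,dr = 0$,
which we wished to prove. The orthogonality is not quite
of the form discussed in the last chapter, for the differential
equation is of slightly different form, the quantity
$(\omega^2/v^2 - n^2/r^2)\,r$ appearing in place of $\omega^2\mu$, so that the final result is not
just like integrating $\mu$ times the product of the functions to get zero.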
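The orthogonality relation, including the weight factor r, can be confirmed by direct numerical integration. The sketch below is an illustration only: it takes n = 1 on a membrane of unit radius with v = 1, so that the arguments are simply x_k r with x_k the zeros of J_1 (standard tabulated values). The function names are hypothetical.

```python
import math

# First three positive zeros of J_1 (standard tabulated values).
J1_ZEROS = [3.8317059702, 7.0155866698, 10.1734681351]

def bessel_j1(x, terms=60):
    """J_1(x) from its power series."""
    term = x / 2.0
    total = 0.0
    for s in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((s + 1) * (s + 2))
    return total

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def weighted_overlap(k, l):
    """Integral from 0 to 1 of r * J_1(x_k r) * J_1(x_l r) dr."""
    x_k, x_l = J1_ZEROS[k], J1_ZEROS[l]
    return simpson(lambda r: r * bessel_j1(x_k * r) * bessel_j1(x_l * r), 0.0, 1.0)

for k in range(3):
    print([round(weighted_overlap(k, l), 6) for l in range(3)])
```

The off-diagonal entries of the printed table should vanish to within the integration error, while the diagonal entries are the positive normalizing integrals called c in the preceding section.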


1. A rectangular drum is 20 by 40 cm., its whole mass is 100 gm., the 
total pull on the faces 50 and 100 kg., respectively. Find the frequencies, 
in cycles per second, of the five lowest modes of vibration, and sketch the 
nodes for each. 

2. The special case of degeneracy arises when a rectangular membrane is 
square. Then the two modes of vibration $e^{i\omega t}\sin(n\pi x/X)\sin(m\pi y/X)$ and
$e^{i\omega t}\sin(m\pi x/X)\sin(n\pi y/X)$ have the same frequency (where we let X = Y).
Thus any linear combination of these is a solution, again with this frequency.
Consider the combinations

$$e^{i\omega t}\left[A\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{X} + B\sin\frac{m\pi x}{X}\sin\frac{n\pi y}{X}\right].$$

Work out the nodes in the case n = 1, m = 2, for (1) B = A; (2) B = -A; 
(3) B = 2A. 

3. A rectangular membrane is struck at its center, starting from rest, in 
such a way that at t = 0 a small rectangular region about the center may be
considered to have a velocity v, and the rest has no velocity. Find the 
amplitudes of the various overtones. 

4. Imagine n and m plotted as two rectangular coordinates. Show that 
a curve of constant $\omega$, plotted in these coordinates, is an ellipse. Each
integral value of n and m corresponds to an overtone, so that if we draw the
point corresponding to each overtone, the number of points within such an
ellipse gives the number of overtones with angular velocity less than $\omega$.
Note that the number of such points per unit area of the plane is just one,
and so find an approximate formula, using the area of the ellipse, for the
number of overtones of frequency less than $\omega$, and also for the number
between $\omega$ and $\omega + d\omega$. Check up this approximation by the exact values
of Prob. 1. 

5. In the circular membrane, suppose that m = 0, and that k is very large,
so that there are many circular nodes. Consider a small region near the
edge of the membrane. The few nodes in this neighborhood will be almost
straight lines, as if we were near the edge of a rectangular membrane. Find
the asymptotic wave length, using the fact that $J_m(x)$ approaches
$\cos(x - \alpha)$ at large x, and show that the wave length is connected with the
velocity and frequency in the usual manner.

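6. Set up the wave equation in three-dimensional spherical coordinates,
in which $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$. Show that it is

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}.$$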

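7. Separate variables in the preceding equation. Show that the function
of $\phi$ is $\sin m\phi$ or $\cos m\phi$, where m is an integer. Show that the equations
for r and $\theta$ are respectively

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\frac{\omega^2}{v^2} - \frac{C}{r^2}\right)R = 0,$$

where $\omega$, C are constants;

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left(C - \frac{m^2}{\sin^2\theta}\right)\Theta = 0.$$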

8. The equation for $\Theta$ in Prob. 7 is called Legendre's equation. Let
$\Theta = \sin^m\theta\,F(\cos\theta)$. Find the differential equation for F, solving in power series
in $\cos\theta$, and show that the series breaks off if $C = l(l + 1)$, where l is an
integer. The resulting functions are called $P_l^m(\cos\theta)$, and are known
as associated Legendre functions. Compute the first few Legendre functions.

9. In the equation for r in Prob. 7, prove that $R = J_{l+\frac{1}{2}}(x)/\sqrt{x}$, where $x = \omega r/v$.

10. Prove that two functions $u_n$ and $u_m$, satisfying differential equations
of the form

$$\frac{d}{dx}\left[T(x)\frac{du_n}{dx}\right] + \left[\omega_n^2\,\mu(x) - f(x)\right]u_n = 0$$

with different $\omega_n$'s, but chosen so that both $u_n$ and $u_m$ are zero at x = 0 and
x = L, satisfy the orthogonality condition $\int_0^L \mu(x)\,u_n(x)\,u_m(x)\,dx = 0$.




In the preceding chapters, we have been treating the vibra- 
tions of elastic strings and membranes, one- and two-dimensional 
bodies, and now we pass to the three-dimensional case, or the 
elastic solid. Of course, the strings and membranes were really
elastic solids, of particular shapes. But there are several ways 
in which we must give a more general treatment than we have 
previously done. First, in the strings and membranes, the 
rigidity of the material itself was not great enough to affect 
the vibration, whereas in the problems we now take up this
rigidity, or the elastic properties of the material in general, will 
be important. Thus we may imagine all gradations of the prob- 
lem of a stretched wire, from the limiting case of a very thin 
long wire under large tension, when our previous theory is 
applicable, down to a short thick bar under small tension or 
even with no tension at all, when the restoring force on a particle, 
far from coming from the tension on the ends, comes from the 
distortion of the bar itself. Secondly, with the strings and 
membranes, we considered only transverse vibrations, while 
here we discuss longitudinal vibrations as well. Of course, 
strings can vibrate longitudinally, but we have so far neglected 
this phase of their motion. Thirdly, a very important part 
of the problems of strings and membranes has arisen from 
the fact that they were limited in space, the membranes being 
very thin pieces of material, the strings thin in two dimensions.
But while some of the problems of the present chapter have this 
property, we shall also consider vibrations and waves in extended 
media going, in the limiting case, to infinity in all dimensions, 
as sound waves in an infinite gas or solid. It is these sound waves 
which show the best analogy to our one- and two-dimensional
wave equations.

106. Stresses, Body and Surface Forces.— The first step in 
discussing the vibrations of an elastic solid, as with the string and 



membrane, is to find the force acting on an infinitesimal volume 
element, and to set this equal to mass times acceleration. The 
forces may be divided into two classes: (1) volume or body forces, 
such as gravity, which act on each volume element of the body, 
and which for the present we neglect, since we shall not use them 
in our applications; and (2) surface forces, with which neighboring 
parts of the medium act on each other, and which are transmitted 
across surfaces, or the forces transmitted across the bounding 
surface of the whole body. The tensions which we have met with 
string and membrane are examples of such forces, or pressures in 
a gas, or shearing forces in a twisted rod. To specify such a force, 
we imagine a surface element dA to be drawn somewhere in the 
body, with a normal n. The material on either side of dA exerts 
a force on the material on the other side; thus this force is a push 
normal to the surface if there is a pressure in the body, it is a
tension if that is the form of stress, or it may be a shearing force. 
The force exerted by the material on one side, on the material on 
the second side, and the other force exerted by the material on 
the second side back on the first side, are action and reaction, and 
are equal and opposite, so that one always has an ambiguity of 
sign in dealing with these forces, or as we call them stresses. We 
adopt the following convention : We imagine dA to be part of the 
surface bounding a volume, and n to be the outer normal. Then 
the force we deal with is the force exerted by the outside on the 
material inside the volume, over dA. Now this force will be a 
vector, and proportional to dA; we call its x, y, and z components
$X_n\,dA$, $Y_n\,dA$, $Z_n\,dA$, respectively. The capital letters indicate
the force components, and the subscript n denotes not a component
but the direction of the surface normal.

The properties of a stress can be completely specified if we 
choose three unit areas at a point, one normal to each of the 
three coordinate axes, and give the components of the force acting 
across each. Thus for the surfaces normal to the x, y, and z axes, 
we have the three force vectors, or nine quantities, 

$$\begin{matrix} X_x & X_y & X_z \\ Y_x & Y_y & Y_z \\ Z_x & Z_y & Z_z. \end{matrix}$$

We see in Fig. 23 the significance of the three components $X_x$,
$Y_x$, $Z_x$. This set of nine quantities forms the so-called stress
tensor. The diagonal terms of the array, $X_x$, $Y_y$, $Z_z$, are called


the normal stresses or pressures, since the force components act 
normal to the surface, and the remaining terms are called shearing 
or tangential stresses. It is easily shown that the force across an 
arbitrary surface which has direction cosines l, m, n for its normal
has an x component $lX_x + mX_y + nX_z$, with corresponding
formulas for the other components.

Fig. 23. — Components of force acting across dy dz.

107. Examples of Stresses. — The simplest stress is probably 
the hydrostatic pressure. There the force acting across a square 
centimeter is always at right angles to the area, and its magnitude 
is by definition the pressure P. The force acts into the body, 
and hence is of negative sign. We thus have $X_x = Y_y = Z_z = -P$,
all other components = 0. A second example is a tension,
say in the x direction. Then the unit area perpendicular to x 
has a force T exerted across it, normal to the area, but there is no 
force exerted across faces perpendicular to y or z. In other words, 
$X_x = T$, all other components of the stress are zero. A third
example is a shear. In Fig. 24a, we have a cube of material,
with equal and opposite tangential forces exerted across the
faces normal to x, the forces acting in the y direction. Over the
right face, the force exerted on the material is in the −y direction,
so that for this face we have $Y_x = -S$, a constant, and
$X_x = Z_x = 0$. Over the opposite face, both force and direction of
normal are reversed, so that the stress components are unchanged.
But now we notice an important feature of shearing stress: the 
two forces we have mentioned exert a torque or couple on the 
cube, and if they were the only forces acting, it could not be in 
equilibrium. To get equilibrium, it proves necessary to have at 
the same time tangential forces exerted across the faces perpendicular
to the y axis, as in Fig. 24b. These forces are equal
in magnitude to the other, so that the torques obviously balance,
and we have $X_y = Y_x = -S$, all other components equal to
zero. This property, that $X_y = Y_x$, proves to be general: the
stress tensor is symmetrical about its diagonal.

By making a proper rotation of axes, it is always possible to 
reduce a stress to diagonal form, in which no shearing stresses 
appear. Thus, in the case we have just considered, the problem 
is obviously symmetrical about the diagonal of the cube. In 
Fig. 24c, we take a surface element whose normal has direction
cosines $l = -1/\sqrt{2}$, $m = 1/\sqrt{2}$, n = 0, giving a force exerted
across it of components $-S/\sqrt{2}$, $S/\sqrt{2}$, 0, or a force of magnitude
S normal to the surface. Similarly in Fig. 24d, we have a
surface at right angles, and find again a force normal to the surface,
but now of magnitude −S. Thus, if we take as new axes
the two 45-deg. diagonals in the xy plane, and the z axis, the
stress consists of a tension S along one axis, negative tension (or
pressure-like force) at right angles, and zero stress across the
face normal to z. Axes of this sort, in which each face has a pure
pressure- or tension-like force across it, and no shear, are called
principal axes of stress.

Fig. 24. — Diagram of shearing stress.
(a) Shear over the faces perpendicular to the x axis.
(b) Additional shear over faces perpendicular to the y axis, necessary to
balance the turning moment of the shear indicated in (a).
(c) and (d) Stress system of (b) referred to principal axes, tension in (c),
pressure in (d).
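The rule for the force across an arbitrary surface, applied to the shearing stress just discussed, can be written out as a short calculation. The sketch below is illustrative only: the function name, the storage of the stress as a nested list, and the value S = 2 (arbitrary units) are assumptions.

```python
import math

def traction(stress, normal):
    """Force per unit area across a surface with unit normal (l, m, n).

    stress[i][j] is the i-th force component across the face normal to axis j,
    so stress[0] = (X_x, X_y, X_z), stress[1] = (Y_x, Y_y, Y_z), and so on.
    """
    return tuple(sum(row[j] * normal[j] for j in range(3)) for row in stress)

S = 2.0  # hypothetical shear magnitude, arbitrary units
shear = [[0.0, -S, 0.0],   # X_x, X_y, X_z
         [-S, 0.0, 0.0],   # Y_x, Y_y, Y_z
         [0.0, 0.0, 0.0]]  # Z_x, Z_y, Z_z

root2 = math.sqrt(2.0)
tension_face = traction(shear, (-1.0 / root2, 1.0 / root2, 0.0))   # case of Fig. 24c
pressure_face = traction(shear, (1.0 / root2, 1.0 / root2, 0.0))   # case of Fig. 24d
print(tension_face)
print(pressure_face)
```

In the first case the force is S times the normal itself (a tension); in the second it is −S times the normal (a pressure-like force), with no tangential part in either case.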

108. The Equation of Motion. — Let us find the force on a small 
element of volume, having sides dx, dy, dz. Over the face at 
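x + dx, there will be a force $X_x(x + dx)$, $Y_x(x + dx)$, $Z_x(x + dx)$
per unit area. Similarly exerted over the face at x there will be
a force $-X_x(x)$, $-Y_x(x)$, $-Z_x(x)$. The x component of the
resulting force is $X_x(x + dx) - X_x(x) = (\partial X_x/\partial x)\,dx$ per unit area,
or $(\partial X_x/\partial x)\,dx\,dy\,dz$ for the area dy dz. The y and z components are
$(\partial Y_x/\partial x)\,dx\,dy\,dz$ and $(\partial Z_x/\partial x)\,dx\,dy\,dz$, respectively. In the same way we
can find the three components of force exerted over each of the
two other pairs of faces. Adding, we have for the total x component
of force $\left(\dfrac{\partial X_x}{\partial x} + \dfrac{\partial X_y}{\partial y} + \dfrac{\partial X_z}{\partial z}\right)dx\,dy\,dz$. Thus, if $v_x$, $v_y$, $v_z$
are the components of velocity of the solid at the point in question,
the equations of motion, remembering that the mass of our
small volume is $\rho\,dx\,dy\,dz$, are

$$\frac{\partial X_x}{\partial x} + \frac{\partial X_y}{\partial y} + \frac{\partial X_z}{\partial z} = \rho\,\frac{\partial v_x}{\partial t},$$

$$\frac{\partial Y_x}{\partial x} + \frac{\partial Y_y}{\partial y} + \frac{\partial Y_z}{\partial z} = \rho\,\frac{\partial v_y}{\partial t},$$

$$\frac{\partial Z_x}{\partial x} + \frac{\partial Z_y}{\partial y} + \frac{\partial Z_z}{\partial z} = \rho\,\frac{\partial v_z}{\partial t}. \qquad (1)$$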

These equations are evidently simply the generalization of those 
used previously with the string and membrane. Thus with the 
membrane let the z axis be normal to the plane of the membrane. 
We consider then only the third equation, giving velocity along 
z. The stress is a tension along the membrane, and if we cut the
membrane with a surface perpendicular to x, we see that, if
the membrane is inclined so that it makes an angle $\alpha$ with the x
axis, there will be a component $Z_x$, a force in the direction to
produce acceleration, equal to $T\alpha$. If then $\alpha = \partial u/\partial x$, where
u is the displacement along z, the first term becomes $T(\partial^2 u/\partial x^2)$,
as we found before. Similarly the second term is $T(\partial^2 u/\partial y^2)$,
and the third is zero, yielding the equation of vibration which 
we have already used. 

109. Transverse Waves. — Two sorts of waves are possible in 
an elastic solid: transverse waves, in which the displacement is 
at right angles to the direction of propagation of the waves, and 
longitudinal waves, as the sound waves in a gas, in which the 
displacement is in the direction of propagation. We consider 
first transverse waves. Rather than taking the general case,
which involves rather complicated formulas, we assume that our
wave is being propagated along the x axis, and that the displacement
of the particles is in the y direction. We shall expect to
get a wave equation involving only x derivatives, not y or z, and
having as solutions either progressive or standing waves. Let
the displacement of a particle in the y direction be $\eta$; since the
wave is being propagated along the x axis, we assume that it has
wave fronts normal to x, such that every point on a wave front
has the same displacement, and this means that $\eta$ is a function of
x only. We may then consider a thin sheet or lamina, as that
between x and x + dx in Fig. 25. Let us suppose that the two
points which in the unstrained medium were at x and x + dx,
y = 0, are displaced to the points P and P', at distances $\eta(x)$ and
$\eta(x + dx)$, respectively, from the axis. Then evidently the
lamina has been sheared, and we must find the relation between
the shearing stress and the strain (that is, displacement) which
it has produced. The type of stress is evidently the sort described
in Fig. 24. The material to the right of x + dx exerts across unit
cross section of the face a force in the y direction, equal to $Y_x$
(or $X_y$). But now Hooke's law says that the actual deformation
of the material, or the strain, is proportional to the stress acting.
In this particular case, the deformation is a shearing one, and is
opposed by the rigidity of the medium (which is the reason why
a liquid, having no rigidity, cannot have transverse waves).
The deformation is given in terms of the coefficient of rigidity $\mu$
as follows: the strain, measured by the tangent of the angle which
the line PP' makes with the x axis, is equal to the shearing stress
divided by $\mu$. In other words, $Y_x = \mu\,\partial\eta/\partial x$. Substituting
this relation between stress and strain in the equations of motion,
we have at once

$$\frac{\partial}{\partial x}\left(\mu\frac{\partial\eta}{\partial x}\right) = \rho\,\frac{\partial v_y}{\partial t}, \qquad (2)$$

or, writing $v_y = \partial\eta/\partial t$,

$$\frac{\partial^2\eta}{\partial x^2} = \frac{\rho}{\mu}\frac{\partial^2\eta}{\partial t^2}, \qquad (3)$$

the one-dimensional wave equation, representing transverse
waves propagated with the velocity $\sqrt{\mu/\rho}$, or the square root of
elastic modulus divided by density. Of course, we should have
got the three-dimensional wave equation if we had considered
propagation in an arbitrary direction.

Fig. 25. — Shear in a transverse plane wave.

110. Longitudinal Waves. — Here again we consider propagation
along the x direction. In Fig. 26, let the displacement of a
particle in the x direction be $\xi(x)$, a function of x only. Evidently
the stress in this case is a pure tension, positive or negative, so
that the force across unit cross section is a pull in the x direction,
equal to $X_x$. Hooke's law now states that the tension is proportional
to the strain; and in particular, that it is proportional to
the change in thickness of the lamina [which is evidently
$\xi(x + dx) - \xi(x)$] divided by the thickness. The constant of proportionality
in this case is not one of the simple elastic constants; it
proves to be written $(\lambda + 2\mu)$, where $\lambda$ is an elastic constant
whose physical meaning is not easy to state. Perhaps as good an
interpretation of $\lambda$ as any is simply to define it from this particular
sort of deformation. We now have $X_x = (\lambda + 2\mu)\,\partial\xi/\partial x$, all
shearing components of stress = 0, and the remaining normal
components do not vary with y or z, so that from the equations of
motion we at once have

$$\frac{\partial^2\xi}{\partial x^2} = \frac{\rho}{\lambda + 2\mu}\frac{\partial^2\xi}{\partial t^2}, \qquad (4)$$

again a wave equation, representing a longitudinal wave traveling
with velocity $\sqrt{(\lambda + 2\mu)/\rho}$, different from the velocity of the
transverse wave.

Fig. 26. — Compression and rarefaction in a longitudinal plane wave.
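The two velocities are easily compared numerically. In the sketch below the function names are hypothetical, and the elastic constants and density are illustrative round numbers of the order of those of a metal (in SI units), not measured data.

```python
import math

def transverse_velocity(mu, rho):
    """Velocity of transverse (shear) waves, sqrt(mu / rho)."""
    return math.sqrt(mu / rho)

def longitudinal_velocity(lam, mu, rho):
    """Velocity of longitudinal (compressional) waves, sqrt((lambda + 2 mu) / rho)."""
    return math.sqrt((lam + 2.0 * mu) / rho)

# Illustrative values only: rigidity and lambda in N/m^2, density in kg/m^3.
mu, lam, rho = 8.0e10, 1.1e11, 7.8e3
v_t = transverse_velocity(mu, rho)
v_l = longitudinal_velocity(lam, mu, rho)
print(round(v_t), round(v_l), round(v_l / v_t, 3))
```

Whenever λ + μ is positive, λ + 2μ exceeds μ, so the longitudinal wave is the faster of the two.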

111. General Wave Propagation. — In the two preceding sec- 
tions, we have derived two very specialized waves which can 
be propagated in an elastic solid, plane longitudinal and trans- 
verse waves traveling along the x axis. Of course, much more 
complicated waves are possible, and if we were discussing the
problem completely, we should set up the three-dimensional
wave equation, of the form $\nabla^2 u = \dfrac{1}{v^2}\dfrac{\partial^2 u}{\partial t^2}$, and derive general
wave solutions. We should have separate equations for the
longitudinal and transverse waves, generalizations of Eqs.
(3) and (4). As we shall learn later when discussing optical
problems, such a wave equation has as solutions not merely
plane waves traveling in all arbitrary directions, but also
spherical waves diverging from point sources, and many more
complicated types of waves. All
these are possible in an elastic solid. In our discussion of the
plane waves, we separated the longitudinal and transverse waves
entirely, allowing one type to exist without the other, but
unfortunately in general this cannot be done. For instance,
when a wave of one type is reflected from a surface, then
unless the reflection is at normal incidence, longitudinal motion
will generally be partly converted into transverse, and vice
versa. In Fig. 27, we show diagrammatically how this could
be, the transverse motion in the incident wave evidently being
in such a direction as to be partly transformed into longitudinal
motion in the reflected wave. For this reason, the complete
treatment of the vibrations of an elastic solid is a very complicated
problem. An example is found in geophysical problems, where
one is interested in the propagation of earthquake waves through
the earth. This case is made even more difficult by the fact
that the elastic properties of the earth change as a function of
depth, so that one must use solutions of the form we have discussed
in Chap. XIV, in connection with strings whose properties
depend on position.

Fig. 27. — Incident transverse wave, with longitudinal reflected wave.

There is one application of the theory of the waves in an elastic 
solid which has at least historical interest. When it was dis- 
covered that light was a transverse wave motion, it was attempted 
to identify these waves with the transverse vibrations of an elastic 
solid, the ether. The general properties, and even some of the 
details, as the quantitative laws giving the fraction of light 
reflected and transmitted at a boundary, were correctly worked 
out, the reflection being treated by analogy with our discussion 
of reflection of waves in strings at a point of discontinuity of 
density, in Chap. XIV. But the difficulty, which could not 
be overcome, was that of eliminating the longitudinal waves, 
which certainly do not occur in optics, but which were inherent 
in the elastic solid theory. This difficulty does not occur in 
the present electromagnetic theory, where only transverse waves 
are allowed by the fundamental differential equations. This 
lack of longitudinal waves makes the problem of optical wave 
motion on the whole simpler than that of elastic waves. 

112. Strains and Hooke's Law. — In discussing transverse and 
longitudinal elastic waves, we had to introduce certain elastic 
constants, measuring the ratio between stress components, 
and certain quantities measuring the strain or deformation 
of the substance. The fact that these strains were proportional 
to the stresses is Hooke's law, the fundamental law of elasticity, 
holding for sufficiently small strains. It is now worth while to 
state the general relation between stress and strain, though we 
shall not go through the proof. 

To begin with, we imagine the body unstrained. Then in the 
process of deformation, we imagine that the particle originally 
at x, y, z has been displaced to a point $x + \xi$, $y + \eta$, $z + \zeta$.
The three quantities $\xi$, $\eta$, $\zeta$ are functions of x, y, z, and are the
three components of a vector. We meet, in other words, a
vector (which we may call the displacement), which is a function
of position. Such a vector field reminds us of a force field, as
a gravitational- or electric-force field, where the force vector
on unit mass or charge, respectively, is a function of position.
We shall meet such vector fields often in the future. Now, the 
displacement is not the same thing as the strain; the body might 


be displaced bodily, without involving any stress or strain at 
all. It is only when the displacement of one side of a small 
element of volume is different from the other, so that the element 
is distorted in size or shape, that we have a strain. In other 
words, the essential quantities in determining the strain are the 
derivatives of £, 77, f with respect to x, y, z. We have already 
seen two examples: with the shear in the transverse wave, the 
strain was b-q/bx, and in the compressional wave the strain was 
b%/bx. In the two cases mentioned, the stress was proportional 
to the corresponding partial derivative, and Hooke's law means 
that this is true in general, in the form that the components 
of stress are linear functions of the partial derivatives of the 
components of displacement. There are nine components of 
stress, of which six are independent (remembering that X y — Y x , 
etc.), and similarly there are nine partial derivatives of displace- 
ment, of which it can be proved that six again are independent. 
This would mean six linear equations, with thirty-six coefficients, 
which would act as elastic constants. In the most general type 
of substance, a completely anisotropic crystal, it can be shown 
that twenty-one of these really are independent, giving a tre- 
mendous number of elastic constants. With isotropic substances 
showing no crystalline structure, however, most of these con- 
stants are either zero or can be written in terms of each other, 
and there are only two independent constants, the X and /j. 
which we have already met ; all other elastic constants, as Young's 
modulus and the compressibility, can be written in terms of 
them. Using these constants, the relations between stress and 
strain prove to have the following form : 

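$$
\begin{aligned}
X_x &= (2\mu + \lambda)\frac{\partial \xi}{\partial x} + \lambda\frac{\partial \eta}{\partial y} + \lambda\frac{\partial \zeta}{\partial z}, &
X_y &= \mu\left(\frac{\partial \xi}{\partial y} + \frac{\partial \eta}{\partial x}\right), \\
Y_y &= \lambda\frac{\partial \xi}{\partial x} + (2\mu + \lambda)\frac{\partial \eta}{\partial y} + \lambda\frac{\partial \zeta}{\partial z}, &
Y_z &= \mu\left(\frac{\partial \eta}{\partial z} + \frac{\partial \zeta}{\partial y}\right), \\
Z_z &= \lambda\frac{\partial \xi}{\partial x} + \lambda\frac{\partial \eta}{\partial y} + (2\mu + \lambda)\frac{\partial \zeta}{\partial z}, &
Z_x &= \mu\left(\frac{\partial \zeta}{\partial x} + \frac{\partial \xi}{\partial z}\right).
\end{aligned}
$$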


In the cases we have taken up already, we have seen two
illustrations of these equations: with transverse waves,
$\partial\eta/\partial x$ was the only partial derivative different
from zero, and we had $X_y = \mu\,\partial\eta/\partial x$; with the
longitudinal wave, $\partial\xi/\partial x$ was the only term
different from zero, and as we see this gives
$X_x = (2\mu + \lambda)\,\partial\xi/\partial x$, as we had before, but
also $Y_y = Z_z = \lambda\,\partial\xi/\partial x$. These latter
stress components, however, since they do not depend on $y$ or $z$,
do not contribute to the equations of motion, as we see by
referring back to these.
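
The isotropic stress-strain relations can be tried out numerically. The following is a minimal sketch, not part of the original text: the function name `isotropic_stress`, the nested-list layout of the displacement derivatives, and the numerical values of the constants are assumptions made for illustration only.

```python
def isotropic_stress(grad_u, lam, mu):
    """Stress tensor of an isotropic solid from Hooke's law.

    grad_u[i][j] is the partial derivative of displacement component i
    (xi, eta, zeta) with respect to coordinate j (x, y, z).
    The returned stress[i][j] follows the same index order, so
    stress[0][0] is X_x and stress[0][1] is X_y.
    """
    dilatation = grad_u[0][0] + grad_u[1][1] + grad_u[2][2]
    stress = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            if i == j:
                # e.g. X_x = (2 mu + lam) dxi/dx + lam deta/dy + lam dzeta/dz
                stress[i][j] = lam * dilatation + 2.0 * mu * grad_u[i][i]
            else:
                # e.g. X_y = mu (dxi/dy + deta/dx)
                stress[i][j] = mu * (grad_u[i][j] + grad_u[j][i])
    return stress


# Illustrative (assumed) elastic constants in arbitrary units.
lam, mu = 2.0, 3.0

# Transverse wave: only deta/dx differs from zero.
shear = isotropic_stress([[0, 0, 0], [0.1, 0, 0], [0, 0, 0]], lam, mu)

# Longitudinal wave: only dxi/dx differs from zero.
compression = isotropic_stress([[0.1, 0, 0], [0, 0, 0], [0, 0, 0]], lam, mu)
```

With these inputs, `shear[0][1]` reproduces $X_y = \mu\,\partial\eta/\partial x$, while `compression` gives $X_x = (2\mu + \lambda)\,\partial\xi/\partial x$ together with the lateral stresses $Y_y = Z_z = \lambda\,\partial\xi/\partial x$.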

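113. Young's Modulus. — To illustrate the use of the equations
connecting stress and strain, we shall discuss the stretching of
a wire. Let the wire be stretched along the $x$ axis, and let the
stress be a pure tension $T$, so that $X_x = T$, and all other stress
components are zero. The $x, y, z$ axes are principal axes for
this stress, and it can be shown that the strain has principal
axes, too, parallel to those of stress, so that the three shear
equations, for $X_y$, etc., do not enter. We are left, then, with the
three equations

$$
\begin{aligned}
T &= (2\mu + \lambda)\frac{\partial \xi}{\partial x} + \lambda\frac{\partial \eta}{\partial y} + \lambda\frac{\partial \zeta}{\partial z}, \\
0 &= \lambda\frac{\partial \xi}{\partial x} + (2\mu + \lambda)\frac{\partial \eta}{\partial y} + \lambda\frac{\partial \zeta}{\partial z}, \\
0 &= \lambda\frac{\partial \xi}{\partial x} + \lambda\frac{\partial \eta}{\partial y} + (2\mu + \lambda)\frac{\partial \zeta}{\partial z}.
\end{aligned}
$$

Subtracting the third from the second, we have
$\partial\eta/\partial y = \partial\zeta/\partial z$.
Using this relation, either the second or third gives

$$
\frac{\partial \eta}{\partial y} = -\sigma\,\frac{\partial \xi}{\partial x},
\qquad \text{where } \sigma = \frac{\lambda}{2(\lambda + \mu)},
$$

and $\sigma$ is called Poisson's ratio.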

Since $\lambda$ and $\mu$ are always positive, it is obvious that
Poisson's ratio is never greater than 1/2. We have found, then, that as
the wire is stretched (positive $\partial\xi/\partial x$), it contracts
sidewise (negative $\partial\eta/\partial y$ and
$\partial\zeta/\partial z$), and the ratio of sidewise contraction
per unit width, to lengthwise stretch per unit length, is given by
Poisson's ratio. Actual materials have Poisson's ratio of the
order of magnitude of 1/3. Now we put this expression back
in the first equation, obtaining
$T = (2\mu + \lambda - 2\lambda\sigma)\,\partial\xi/\partial x$.
The elastic modulus $(2\mu + \lambda - 2\lambda\sigma)$, giving the
tension, or force per unit area, divided by the elongation per unit
length, is called Young's modulus, and is denoted by $E$. In the
problems we find other ways of writing the relations between Young's
modulus, Poisson's ratio, and the other elastic constants.
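
As a quick numerical check of these definitions, the sketch below computes Poisson's ratio and Young's modulus from $\lambda$ and $\mu$, and then inverts them using the relations that Prob. 3 at the end of the chapter asks the reader to prove. The function names and the sample values are assumptions chosen for illustration.

```python
def poisson_ratio(lam, mu):
    """sigma = lambda / (2 (lambda + mu))."""
    return lam / (2.0 * (lam + mu))


def youngs_modulus(lam, mu):
    """E = 2 mu + lambda - 2 lambda sigma, as obtained for the stretched wire."""
    sigma = poisson_ratio(lam, mu)
    return 2.0 * mu + lam - 2.0 * lam * sigma


def lame_from_E_sigma(E, sigma):
    """Invert the two relations above (the result quoted in Prob. 3)."""
    lam = E * sigma / ((1.0 + sigma) * (1.0 - 2.0 * sigma))
    mu = E / (2.0 * (1.0 + sigma))
    return lam, mu


# Assumed sample constants in arbitrary units.
lam, mu = 1.0, 1.0
sigma = poisson_ratio(lam, mu)
E = youngs_modulus(lam, mu)
lam_back, mu_back = lame_from_E_sigma(E, sigma)
```

The round trip should return the original $\lambda$ and $\mu$, which is one way to convince oneself that the expression for $E$ and the relations of Prob. 3 are mutually consistent.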

It is worth noticing that Young's modulus was not the elastic 
constant which entered into the velocity of compressional waves. 
If we had longitudinal waves traveling down a wire, the wire 
would contract laterally at those points where it was under
tension, expand when it was under compression, as given by
Poisson's ratio, and for such a wave the velocity would be
determined from Young's modulus. But in our extended 
medium, we did not allow the possibility of the lateral motion 
connected with such a contraction and expansion, since in a 
medium of large dimensions compared with the wave length 
this would amount to a very large transverse motion. We 
assumed instead that the motion was purely longitudinal, and 
found that we had to assume the existence of other lateral 
stresses, tensions $Y_y$ and $Z_z$, to counteract the tendency to
expansion and contraction. These stresses changed the condi- 
tions of the problem, and in particular the elastic modulus 
concerned in the velocity of propagation of the wave. 


1. In Fig. 28, let the normal to the inclined face of the prism have
direction cosines $l, m, n$. Compute the total forces exerted by an
arbitrary stress on the prism, and prove that the net force is zero,
and the prism is in equilibrium, only if the force per unit area over
the face perpendicular to $n$ has $x$ component
$lX_x + mX_y + nX_z$, etc.

Fig. 28. — Prism for computing force exerted by stresses across a face with
arbitrary normal n.

2. Rotate coordinates to reduce an arbitrary stress to principal axes. 
Carry through the problem of the pure shear, discussed in Fig. 24, as an 
illustration of the general method. 

3. Prove that in terms of Young's modulus and Poisson's ratio we have

$$
\lambda = \frac{E\sigma}{(1 + \sigma)(1 - 2\sigma)}, \qquad
\mu = \frac{E}{2(1 + \sigma)}.
$$

4. Assume a body is under pure hydrostatic pressure $P$. Show that the
distortion is a decrease of all dimensions by a fixed fraction. Show that
the fractional change in volume is
$\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z$.
Using this, show that the compressibility $\kappa$ of a solid under
hydrostatic pressure, which by definition is the fractional decrease of
volume divided by the pressure, equals $3(1 - 2\sigma)/E$.

5. Show that the velocity of a longitudinal wave in a fluid, for which
$\mu$ is zero, is $1/\sqrt{\kappa\rho}$, where $\kappa$ is the
compressibility.

6. A rectangular beam held at one end is bent into an arc of a circle,
the radius of curvature of its central section being $R$. Find the stress
distribution throughout the beam, showing that the beam will be kept in
equilibrium by a torque or couple of the sort indicated. Show that for a
given torque the curvature of the beam is inversely proportional
to $ab^3E$, where $E$ is Young's modulus (see Fig. 29).

Fig. 29. — Bent beam.


7. A circular cylinder of height $h$ rests in equilibrium under the
action of gravity. Take a coordinate system with the $xy$ plane in the
top base of the cylinder and the positive $z$ axis pointing downward.
Show that the only component of stress different from zero is
$Z_z = -\rho g z$, if $\rho$ is the density of the cylinder. Using
Hooke's law show that the strains are
$\partial\xi/\partial x = \partial\eta/\partial y = (\sigma/E)\rho g z$,
and $\partial\zeta/\partial z = -(1/E)\rho g z$, and find
the other partial derivatives. Integrate these expressions to find the
components of the displacement of any point of the medium, remembering
that the strains are partial derivatives. Show that a horizontal plane
section of the cylinder becomes a paraboloid of rotation due to the
deformation. Show that the radius of the cylinder increases from top to
bottom when it is thus deformed.

8. A spherical shell of inner radius $R_1$, outer radius $R_2$, contains
a fluid of pressure $P_1$, and is immersed in a second fluid of lower
pressure $P_2$. It can be shown that the displacements of points on
account of the pressure are given by $\xi = x(A + B/r^3)$,
$\eta = y(A + B/r^3)$, $\zeta = z(A + B/r^3)$.
Verify these values by computing the stresses at any point, substituting in
the equations of motion, and showing that they result in equilibrium. Show
further that the force across an area normal to the radius is itself normal to
the surface, so that the stress within the sphere can be balanced by
hydrostatic pressures within and without.

9. In the shell of Prob. 8, determine $A$ and $B$ so that the pressure will
have the proper values at $R_1$ and $R_2$. Discuss the stress within the shell,
showing that the principal axes at any point are along the radius and two
arbitrary directions at right angles, and find the tension or pressure along
the directions at right angles, discussing the final result physically, with
special reference to possible breaking of the shell under excessive pressure.



In the last chapter we discussed the equation of motion of an 
elastic body where there was no mass motion or flow. Now we 
pass to hydrodynamics and the flow of fluids. Much of what we 
say, however, applies to flow in general— such as heat flow, which 
we shall take up in the next chapter— and even to such a different 
subject as electrostatics. The feature in common in all these prob- 
lems is the existence of a vector field. By that we mean a vector 
defined at each point of space. We have already met such a field 
in our general discussion of forces and potentials in Chap. VI, 
for the force is defined at every point of space and forms a vector 
field. In the present case the vector 
is the velocity of the flowing fluid, 
or the closely related flux density. 
With heat flow it is again a flux 
density for the flowing heat, and for 
electricity the electric field. All 
these problems, though so different 
physically, are thus mathematically 
similar and can be treated by the 
same analytical methods. 

114. Velocity, Flux Density, and Lines of Flow. — At every point of a
flowing medium, we can define the velocity, a vector (the
time rate of change of the displacement, which we used in the
last chapter, and to which we assigned components $\xi, \eta, \zeta$).
Also we can give the density $\rho$, and both $\rho$ and $v$ are in
general functions of position $(x, y, z)$ and of time. We may now ask,
How much material will flow across any area per second? This
total flow across a surface is called a flux. In Fig. 30, we
consider an infinitesimal surface element $dS$. With $dS$ as a base we
erect a prism, the slant height being the velocity $v$, which in
general is not normal to $dS$. Evidently the material in the
prism will just be that which crosses $dS$ in one second, since in
this time it will move a distance $v$, and fill the dotted prism. But
this is $\rho$ (the density) times the volume of the prism (the base
$dS$ times the altitude $v_n$, where $n$ is the normal to the surface),
or $\rho v_n\,dS$. The quantity $\rho v$ is called the flux density,
and we may denote it by $f$. Then for a finite area, the total flux will
be the sum of the contributions from all the surface elements, or a
surface integral $\iint f_n\,dS = \iint \rho v_n\,dS$. In some kinds of
flow, such as heat flow, there is an analogue to the flux, but not to
the density and velocity separately, so that one regards the flux
density as being the more fundamental vector field.

Fig. 30. — Flux through an area
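
The prism argument amounts to the rule "flux = density times normal component of velocity times area". A minimal sketch for a flat area in a uniform flow follows; the function name, the tuple representation of vectors, and the numbers are illustrative assumptions, not anything taken from the text.

```python
def flux_through_flat_area(rho, v, normal, area):
    """Mass crossing a flat area per second in a uniform flow.

    rho    : density
    v      : velocity vector (vx, vy, vz)
    normal : any vector along the normal to the area (need not be unit length)
    area   : size of the area
    Returns rho * v_n * area, with v_n the component of v along the normal.
    """
    norm = sum(c * c for c in normal) ** 0.5
    v_n = sum(vc * nc for vc, nc in zip(v, normal)) / norm
    return rho * v_n * area


# Assumed example: water-like density, flow along x at 2 units per second.
head_on = flux_through_flat_area(1000.0, (2.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.5)
tilted = flux_through_flat_area(1000.0, (2.0, 0.0, 0.0), (1.0, 1.0, 0.0), 0.5)
edge_on = flux_through_flat_area(1000.0, (2.0, 0.0, 0.0), (0.0, 1.0, 0.0), 0.5)
```

Tilting the area reduces the flux in proportion to the cosine of the angle between the velocity and the normal, and an area whose normal is perpendicular to the flow receives no flux at all.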

We can draw lines through the medium, tangent at every point 
to the direction of flow at that point. These are called lines of 
flow. Similarly we can set up tubes of flow, the elements of their 
surfaces being lines of flow. We can imagine the substance to 
flow through these tubes, as water flows through a pipe, never 
passing outside, since the velocity is always tangential to the 
surface of the tube. In hydrodynamics these lines of flow are 
called streamlines, and the sort of flow in which they are inde- 
pendent of time is called streamline flow. 

115. The Equation of Continuity. — Consider a fixed volume
in a flowing fluid. The amount of fluid in the volume is
$\iiint \rho\,dv$, and this can change in two ways. First, liquid can
flow into the volume over the surfaces. Secondly, it may be possible for
liquid to be produced within the volume without having flowed
in. For instance, in a swimming pool, for all practical purposes
we may consider the opening of the inlet pipe as a region where
fluid is appearing, and the outlet as a place where it is
disappearing. Such regions are called sources and sinks, respectively.
Then we have

$$
\iiint \frac{\partial \rho}{\partial t}\,dv = \text{rate of inflow over the surface} + \text{rate of production inside.}
$$

Now we have just seen that the rate of flow over any surface,
or flux, is $\iint f_n\,dS$. This represents outflow if $n$ is the outer
normal to a closed surface, so that we must change sign to get inflow.
If in addition we assume that the rate of production of material
per unit volume is $P$, we have

$$
\iiint \frac{\partial \rho}{\partial t}\,dv = -\iint f_n\,dS + \iiint P\,dv,
$$

the volume integrals being over the whole region we are considering,
the surface integral over the surface enclosing this volume.
If we now apply our equation to an infinitesimal volume in the
form of a rectangular parallelepiped, bounded by $x$, $x + dx$,
$y$, $y + dy$, $z$, $z + dz$, we can put the equation in a form not
involving integrals. The flow to the right (into the volume) over the
face at $x$ is $f_x(x)\,dy\,dz$. The flow out over the face at
$x + dx$ is

$$
f_x(x + dx)\,dy\,dz = f_x(x)\,dy\,dz + \frac{\partial f_x}{\partial x}\,dx\,dy\,dz + \cdots .
$$

Thus the total inflow over these two faces is
$-(\partial f_x/\partial x)\,dx\,dy\,dz$. Adding
similar contributions from the other faces we have for the total
inflow

$$
-\iint f_n\,dS = -\left(\frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}\right)dx\,dy\,dz
= -(\nabla \cdot \mathbf{f})\,dv = -\operatorname{div}\mathbf{f}\,dv,
$$

where the divergence is a vector operator discussed in Chap. VI.
Hence, dividing through by $dv$,

$$
\frac{\partial \rho}{\partial t} = -\operatorname{div}\mathbf{f} + P. \tag{1}
$$

This is often called the equation of continuity. We may note
several special cases. If there is no production of fluid in $dv$, it
becomes

$$
\frac{\partial \rho}{\partial t} + \operatorname{div}\mathbf{f} = 0,
$$

or using $\mathbf{f} = \rho\mathbf{v}$,

$$
\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \tag{2}
$$

Again, in a steady state, where density is independent of time,

$$
\operatorname{div}\mathbf{f} = P. \tag{3}
$$

This equation shows the physical meaning of the divergence of a
vector: it measures the rate of production of the flowing
substance, per unit volume. Finally, if no substance is being
produced at the point in question, and density is independent of
time, $\operatorname{div}\mathbf{f} = 0$, and we have a divergenceless
flow.
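
The interpretation of the divergence as a rate of production per unit volume can be illustrated numerically. The sketch below estimates $\operatorname{div}\mathbf{f}$ by central differences for two flux fields chosen purely as examples (they are assumptions, not taken from the text): the field $\mathbf{r}/r^3$ of a point source, which should be divergenceless away from the origin, and the field $(x, y, z)$, which corresponds to uniform production $P = 3$ in a steady state.

```python
def divergence(f, point, h=1e-5):
    """Central-difference estimate of div f at a point.

    f is a function (x, y, z) -> (fx, fy, fz); h is the step length.
    """
    x, y, z = point
    dfx = (f(x + h, y, z)[0] - f(x - h, y, z)[0]) / (2 * h)
    dfy = (f(x, y + h, z)[1] - f(x, y - h, z)[1]) / (2 * h)
    dfz = (f(x, y, z + h)[2] - f(x, y, z - h)[2]) / (2 * h)
    return dfx + dfy + dfz


def point_source(x, y, z):
    """Flux density r / r^3 flowing radially out of a source at the origin."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (x / r3, y / r3, z / r3)


def uniform_production(x, y, z):
    """Flux density (x, y, z), whose divergence is 3 everywhere."""
    return (x, y, z)


away_from_source = divergence(point_source, (1.0, 2.0, -0.5))
with_production = divergence(uniform_production, (0.3, 0.1, 0.2))
```

Within the accuracy of the finite differences, the first value vanishes and the second equals 3, in line with Eq. (3).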

116. Gauss's Theorem. — We have proved that the amount of
substance flowing out of a small volume $dx\,dy\,dz = dv$ per second
equals $\operatorname{div}\mathbf{f}\,dv$ in steady flow. Suppose now
that we have a large volume and that we wish to find the total amount
flowing out of it per second. This is simply the sum of the amounts
flowing from each element. Thus it is a volume integral,
$\iiint \operatorname{div}\mathbf{f}\,dv$.
On the other hand, the material all flows through the surface, so
that the rate of outflow is $\iint f_n\,dS$. These two expressions must
be equal:

$$
\iiint \operatorname{div}\mathbf{f}\,dv = \iint f_n\,dS. \tag{4}
$$

This is Gauss's theorem, and it holds for any vector $\mathbf{f}$ which
is a function of position.
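
Gauss's theorem is easy to check numerically on a simple region. The sketch below compares the two sides of Eq. (4) for the unit cube and the sample field $\mathbf{f} = (x^2, xy, z)$, using the midpoint rule; the field, the cube, and the function names are assumptions made for illustration.

```python
def f(x, y, z):
    """Sample vector field (assumed for illustration)."""
    return (x * x, x * y, z)


def div_f(x, y, z):
    """Divergence of the sample field, worked out by hand: 2x + x + 1."""
    return 2.0 * x + x + 1.0


def volume_integral(n=20):
    """Midpoint-rule integral of div f over the unit cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    return sum(div_f(x, y, z) for x in mids for y in mids for z in mids) * h ** 3


def surface_integral(n=20):
    """Midpoint-rule outward flux of f through the six faces of the unit cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for a in mids:
        for b in mids:
            total += f(1.0, a, b)[0] - f(0.0, a, b)[0]  # faces x = 1 and x = 0
            total += f(a, 1.0, b)[1] - f(a, 0.0, b)[1]  # faces y = 1 and y = 0
            total += f(a, b, 1.0)[2] - f(a, b, 0.0)[2]  # faces z = 1 and z = 0
    return total * h * h


lhs = volume_integral()
rhs = surface_integral()
```

Both integrals come out equal (to 5/2 for this field), as the theorem requires. The minus signs in `surface_integral` arise because the outward normal on the faces at 0 points in the negative coordinate direction.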

117. Lines of Flow to Measure Rate of Flow. — Let us set up a 
definite number of lines of flow, so that the number crossing a 
unit area perpendicular to the flow is numerically equal to the 
magnitude of the flux density. We could surely do this, but we 
might have the necessity of sometimes letting lines start or stop, 
to keep the right number. We can prove, however, that with a 
divergenceless flow this would not be necessary. The lines start 
or stop only at places where the divergence is different from zero : 
that is, they start at sources, stop at sinks. For an elementary 
proof, let us take a short section of a tube of flow, bounded by
two surfaces normal to the flow. Let one of them have an area
$A_1$, the other $A_2$, and let the magnitude of the flux density over
the one face be $f_1$, over the other $f_2$. Then the total current in
over one face is $f_1A_1$, and out over the other is $f_2A_2$. If the
flow is divergenceless, these are equal. But the number of lines per
unit area on the first is $f_1$, so that the number cutting the one end
of the tube is $f_1A_1$, and the number emerging at the other end is
$f_2A_2$. Since these are equal, no lines are lost or start within. In
other words, in a divergenceless flow, lines never start or stop except
at sources or sinks. For a more general proof we note that the
number of lines crossing a surface element $dS$, by definition,
is $f_n\,dS$. Then the number emerging from a closed surface, and
which therefore have started within the surface, is $\iint f_n\,dS$. But
by Gauss's theorem this is $\iiint \operatorname{div}\mathbf{f}\,dv$,
and is zero if the flow is divergenceless.

118. Irrotational Flow and the Velocity Potential. — In Chap.
VI we studied vector fields like our flux vector; we were interested
then in forces. We saw that under certain conditions, a force
could be written as a gradient of a potential function. The
condition was that the work done in taking a particle around any
closed path should be zero, or that the field should be
conservative: $\oint \mathbf{F} \cdot d\mathbf{s} = 0$ around any
contour. We had another way of stating the condition: it was
$\operatorname{curl}\mathbf{F} = 0$ everywhere. In a
similar way, if the curl of our velocity vector is zero, we can
introduce a potential function here. It is now to be regarded as
a purely mathematical device, used simply by analogy with our
previous cases, and having nothing to do with potential energy.
A flow whose curl is everywhere zero is called an irrotational flow.
It is easy to prove that in a whirlpool the curl is different from
zero (see for instance Prob. 4, Chap. VI), a nonvanishing curl
indicating in fact exactly a whirlpool. Now, physically, we are
acquainted with two sorts of fluid flow: streamline flow and
turbulent flow. In the latter, eddies or whirlpools form, and the
curl of the velocity is not zero. But in the former, there are no
eddies, the curl of the velocity is zero, and the flow is irrotational.
In a streamline flow, then, we can introduce a potential function,
called the velocity potential $\phi$, defined by
$\mathbf{v} = -\operatorname{grad}\phi$. The
velocity potential, of course, is not a potential energy; its analogy
with potential energies is mathematical rather than physical.
Nevertheless, we can draw surfaces of constant velocity potential,
or equipotentials, and the lines of flow will cut the equipotentials
at right angles. Using the equation of continuity, and assuming
that $\rho$ is constant, we have as the general equation for the
velocity potential

$$
\operatorname{div}(\rho\mathbf{v}) = -\rho\,\operatorname{div}\operatorname{grad}\phi = -\rho\nabla^2\phi = -\frac{\partial \rho}{\partial t} + P, \tag{4}
$$

reducing to Laplace's equation $\nabla^2\phi = 0$ for a steady state
where there are no sources or sinks.

Fig. 31. — Lines of flow and equipotentials for flow about a cylinder. Full
lines indicate lines of flow, dotted lines equipotentials. In a corresponding
electrical problem with charges distributed within the cylinder, and placed in a
uniform external electric field, the dotted lines would be lines of force, full lines
equipotentials.

The introduction of a velocity potential satisfying Laplace's 
equation makes it possible in many cases to solve hydrodynamic 
problems by analogy with similar problems in other branches of 
physics, as electrostatics. In Chap. XIX we shall find that the 
electrostatic potential satisfies Laplace's equation, the lines of 
force being normal to the equipotentials, so that any set of electro- 
static equipotentials can be used for a suitable hydrodynamic 
problem. For instance, in Fig. 31, we show the lines of flow and
equipotentials for flow of a liquid about a cylinder. The same
lines, however, represent lines of force resulting from a certain
distribution of charges in the center of the cylinder, superposed on
a uniform electric field.
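
For flow about a cylinder, a standard closed-form velocity potential is available, and it offers a concrete check of the statements above. The text does not write this potential down, so the formula used in the sketch, $\phi = -U x\,(1 + a^2/r^2)$ for a cylinder of radius $a$ in a stream of speed $U$ along $x$, is brought in here as an assumption, as are the function names and step lengths. The sketch differentiates $\phi$ numerically to obtain $\mathbf{v} = -\operatorname{grad}\phi$ and estimates $\nabla^2\phi$ with a five-point formula.

```python
import math


def phi(x, y, U=1.0, a=1.0):
    """Assumed velocity potential for flow past a cylinder of radius a."""
    r2 = x * x + y * y
    return -U * x * (1.0 + a * a / r2)


def velocity(x, y, U=1.0, a=1.0, h=1e-6):
    """v = -grad(phi), estimated by central differences."""
    vx = -(phi(x + h, y, U, a) - phi(x - h, y, U, a)) / (2 * h)
    vy = -(phi(x, y + h, U, a) - phi(x, y - h, U, a)) / (2 * h)
    return vx, vy


def laplacian(x, y, U=1.0, a=1.0, h=1e-3):
    """Five-point finite-difference estimate of the Laplacian of phi."""
    return (phi(x + h, y, U, a) + phi(x - h, y, U, a)
            + phi(x, y + h, U, a) + phi(x, y - h, U, a)
            - 4.0 * phi(x, y, U, a)) / (h * h)


# A point on the surface of the cylinder, at an arbitrary angle.
angle = 0.7
vx, vy = velocity(math.cos(angle), math.sin(angle))
normal_velocity = vx * math.cos(angle) + vy * math.sin(angle)

laplacian_outside = laplacian(2.0, 1.0)
vx_far, vy_far = velocity(100.0, 0.0)
```

The normal component of velocity vanishes on the cylinder, so no fluid crosses its surface; the Laplacian vanishes in the fluid, and far from the cylinder the velocity approaches the uniform stream.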

119. Euler's Equations of Motion for Ideal Fluids. — The equa- 
tion of continuity serves to determine the velocity of flow of a 
liquid, but does not determine the pressures, or make any 
connection with forces. It is essentially a kinematical rather 
than a dynamical law. It is one of two fundamental equations 
governing fluid motion. The other is essentially the Newtonian 
law, force equals mass times acceleration. For a continuous 
medium, we have already seen how this is to be formulated in 
the preceding chapter, where we wrote the force on an element 
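of volume in terms of the stresses. As was mentioned in the last
chapter, an ideal fluid is characterized by the fact that it supports
no shear and hence $\mu = 0$. For this case the six stress
components reduce to one, namely $X_x = Y_y = Z_z = -p$ and
$X_y = Y_z = Z_x = 0$, if $p$ denotes the pressure in the fluid.
Furthermore, if there is flow of the fluid one must consider the
velocity of each particle as a function of $x, y, z$, and $t$, and hence

$$
\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_x}{\partial y} + v_z\frac{\partial v_x}{\partial z}
$$

and two similar expressions for $v_y$ and $v_z$. Written in vector
form with the help of our symbolic vector $\nabla = \operatorname{grad}$,

$$
\frac{d\mathbf{v}}{dt} = \frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{v} = \frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \operatorname{grad})\mathbf{v},
$$

i.e., we form the scalar product of $\mathbf{v}$ and $\nabla$ and then
operate on $v_x$, and similarly on the other components. Our general
equations of motion become in this case:

$$
\begin{aligned}
\rho X - \frac{\partial p}{\partial x} &= \rho\left\{\frac{\partial v_x}{\partial t} + (\mathbf{v} \cdot \operatorname{grad})v_x\right\}, \\
\rho Y - \frac{\partial p}{\partial y} &= \rho\left\{\frac{\partial v_y}{\partial t} + (\mathbf{v} \cdot \operatorname{grad})v_y\right\}, \\
\rho Z - \frac{\partial p}{\partial z} &= \rho\left\{\frac{\partial v_z}{\partial t} + (\mathbf{v} \cdot \operatorname{grad})v_z\right\},
\end{aligned}
$$

where $X, Y, Z$ represent the body force (as gravitation) per
unit mass, which we neglected in the last chapter. Combined
into one vector equation this gives

$$
\mathbf{F} - \frac{1}{\rho}\operatorname{grad} p = \frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \operatorname{grad})\mathbf{v}, \tag{5}
$$

where $\mathbf{F}$ is the body force per unit mass. These are the Euler
equations of hydrodynamics. In them $\rho$ (the density) is considered
a known function of the pressure as given by the equation of state of
the substance. We then have $p, v_x, v_y, v_z$ as functions of
$x, y, z$, and $t$. The three equations above and the continuity
equation provide the necessary four equations to give a unique solution.
For the case of hydrostatic equilibrium, these equations reduce
to the form $\mathbf{F} = (1/\rho)\operatorname{grad} p$, from which
such familiar things as Archimedes' principle immediately follow.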
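
The hydrostatic case can be made concrete with a small numerical sketch. Assuming a liquid of constant density under uniform gravity, the hydrostatic equation gives a pressure increasing linearly with depth, and the net upward force on a submerged rectangular box is the difference of the pressure forces on its bottom and top faces. The function names, the choice of a box, and the numerical values are assumptions made for illustration.

```python
def pressure(depth, rho, g=9.81, p_surface=0.0):
    """Hydrostatic pressure at a given depth, from dp/dz = rho * g."""
    return p_surface + rho * g * depth


def buoyant_force_on_box(top_depth, height, area, rho, g=9.81):
    """Net upward force on a box with horizontal top and bottom faces.

    Pressure forces on the vertical sides cancel in pairs, so only the
    top and bottom faces contribute.
    """
    p_top = pressure(top_depth, rho, g)
    p_bottom = pressure(top_depth + height, rho, g)
    return (p_bottom - p_top) * area


# Assumed example: water-like density, box 0.5 high with faces of area 0.2.
rho, g = 1000.0, 9.81
force = buoyant_force_on_box(top_depth=2.0, height=0.5, area=0.2, rho=rho, g=g)
weight_of_displaced_liquid = rho * g * (0.5 * 0.2)
```

The two numbers agree, and the result does not depend on `top_depth`, which is Archimedes' principle for this simple shape.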

120. Irrotational Flow and Bernoulli's Equation. — If there is 
irrotational flow, and the velocity is derived from a velocity 
potential, Euler's equations take a particularly simple form. 
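If $\mathbf{v} = -\operatorname{grad}\phi$, then we have

$$
\begin{aligned}
(\mathbf{v} \cdot \operatorname{grad})v_x &= -(\mathbf{v} \cdot \operatorname{grad})\frac{\partial \phi}{\partial x}
= \frac{\partial \phi}{\partial x}\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial \phi}{\partial y}\frac{\partial^2 \phi}{\partial x\,\partial y} + \frac{\partial \phi}{\partial z}\frac{\partial^2 \phi}{\partial x\,\partial z} \\
&= \frac{1}{2}\frac{\partial}{\partial x}\left[\left(\frac{\partial \phi}{\partial x}\right)^2 + \left(\frac{\partial \phi}{\partial y}\right)^2 + \left(\frac{\partial \phi}{\partial z}\right)^2\right],
\end{aligned}
$$

so that

$$
(\mathbf{v} \cdot \operatorname{grad})\mathbf{v} = \operatorname{grad}\left(\frac{v^2}{2}\right)
$$

in the special case where $\operatorname{curl}\mathbf{v} = 0$. Further,
we introduce a quantity $\Pi = \int dp/\rho$, whose gradient is

$$
\operatorname{grad}\Pi = \frac{d\Pi}{dp}\operatorname{grad} p = \frac{1}{\rho}\operatorname{grad} p.
$$

Euler's equation for the steady state, where $\mathbf{v}$ is independent
of time, then becomes

$$
\mathbf{F} = \operatorname{grad}\left(\Pi + \frac{v^2}{2}\right).
$$

As a result of this equation, we see that for irrotational flow to
occur, $\mathbf{F}$ must be the gradient of a certain quantity, or
$\mathbf{F}$ must be a conservative force, derivable from a potential.
We may then set $\mathbf{F} = -\operatorname{grad} V$, and Euler's
equation becomes

$$
\operatorname{grad}\left(V + \Pi + \frac{v^2}{2}\right) = 0,
$$

or, integrated,

$$
V + \Pi + \frac{v^2}{2} = \text{constant}.
$$

This is Bernoulli's equation. For the special case of an
incompressible fluid, $\rho$ is independent of $p$, so that $\Pi$ is
equal to $p/\rho$. In that case the equation may be written

$$
\rho V + p + \tfrac{1}{2}\rho v^2 = \text{constant}.
$$

Bernoulli's equation is essentially an energy integral, the term
$\rho V$ representing the potential energy per unit volume, $p$ the
contribution to the energy resulting from the pressure, and
$\tfrac{1}{2}\rho v^2$ the kinetic energy per unit volume. As we have
stated,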
Bernoulli's equation, supplemented for a compressible fluid 
by the relation giving density as function of pressure, determines 
the pressure at each point of space, when the velocity and external 
potential are known. For instance, if there is no external force 
field (V = 0), we see that the pressure decreases at points 
where the velocity is high, which means at points where the tubes 
of flow narrow down. 
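
The statement that the pressure falls where the tubes of flow narrow can be illustrated with a short calculation for an incompressible fluid with no external force ($V = 0$). The sketch below combines the constancy of the current along a tube of flow, $v_1 A_1 = v_2 A_2$ for constant density, with Bernoulli's equation in the form $p + \tfrac{1}{2}\rho v^2 = \text{constant}$. The function name and the numerical values are assumptions for illustration.

```python
def downstream_state(p1, v1, area1, area2, rho):
    """Velocity and pressure at a second cross section of a tube of flow.

    Assumes an incompressible fluid, steady irrotational flow, and no
    external force field (V = 0).
    """
    v2 = v1 * area1 / area2                     # same current through both sections
    p2 = p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2)   # Bernoulli: p + rho v^2 / 2 constant
    return v2, p2


# Assumed example: the tube narrows to half its cross section.
v2, p2 = downstream_state(p1=200000.0, v1=1.0, area1=2.0, area2=1.0, rho=1000.0)
```

Halving the cross section doubles the velocity and lowers the pressure by $\tfrac{1}{2}\rho(v_2^2 - v_1^2)$; widening the tube has the opposite effect.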

121. Viscous Fluids. — In Sec. 119 we mentioned the fact that
ideal fluids support no shearing stresses. This, however, is not
true of viscous fluids. Imagine a viscous liquid flowing
horizontally, the lower layers dragging along the bottom, and the
velocity increasing with height, so that $v_x = v_x(y)$ and the other
components of $\mathbf{v}$ are zero, if the $xz$ plane is horizontal
and $y$ is vertical. Then if we imagine a horizontal element of area in
the liquid at a certain height, the material above the element of area
will pull tangentially on the material below it on account of viscosity,
thus exerting a shearing stress. Experimentally, this stress,
which is $X_y$, is proportional to the rate of increase of horizontal
component of velocity with height: if $k$ is the coefficient of
viscosity, $X_y = k\,\partial v_x/\partial y$. This is a special case
of the general laws governing stresses in a viscous medium, connecting
the stresses with the rates of change of the velocity components with
position. In the last chapter we have given the general form of Hooke's
law, the law giving stresses in an elastic medium in terms of the
strains. By analogy we can set up the relations for a viscous
fluid, but now the stresses are proportional, not to the strain
components themselves, but to their time derivatives. By
comparison with Eq. (5), Chap. XVI, we see that $k$ takes
the place of the shear modulus, and that the component of
strain $\partial\xi/\partial y + \partial\eta/\partial x$ must be
replaced by its time derivative,
$\partial v_x/\partial y + \partial v_y/\partial x = \partial v_x/\partial y$
in our special case, since $v_y = 0$.
This tells us how in general we are to change Hooke's law for
the case of viscous incompressible fluids. We place
$\partial v_x/\partial x + \partial v_y/\partial y + \partial v_z/\partial z = \operatorname{div}\mathbf{v} = 0$,
corresponding to
$\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z = 0$
in the strains, replace $\mu$ by $k$ and insert the time derivatives
of the strain components. Thus we have the following
relations between the stress and strain components for liquids:

$$
\begin{aligned}
X_x &= -p + 2k\frac{\partial v_x}{\partial x}, & X_y &= k\left(\frac{\partial v_x}{\partial y} + \frac{\partial v_y}{\partial x}\right), \\
Y_y &= -p + 2k\frac{\partial v_y}{\partial y}, & Y_z &= k\left(\frac{\partial v_y}{\partial z} + \frac{\partial v_z}{\partial y}\right), \\
Z_z &= -p + 2k\frac{\partial v_z}{\partial z}, & Z_x &= k\left(\frac{\partial v_z}{\partial x} + \frac{\partial v_x}{\partial z}\right),
\end{aligned}
\tag{6}
$$

where we have included the ordinary pressure of the liquid in
addition to the viscous stresses. Inserting the values of the
stress components in the equations of motion (2) of the previous
chapter and remembering that for an incompressible fluid we
have the continuity equation
$\operatorname{div}\mathbf{v} = \partial v_x/\partial x + \partial v_y/\partial y + \partial v_z/\partial z = 0$,
there follow the general equations of motion for viscous liquids:

$$
\begin{aligned}
\rho X - \frac{\partial p}{\partial x} + k\nabla^2 v_x &= \rho\frac{dv_x}{dt}, \\
\rho Y - \frac{\partial p}{\partial y} + k\nabla^2 v_y &= \rho\frac{dv_y}{dt}, \\
\rho Z - \frac{\partial p}{\partial z} + k\nabla^2 v_z &= \rho\frac{dv_z}{dt},
\end{aligned}
$$

or in vector form:
$\rho\mathbf{F} - \operatorname{grad} p + k\nabla^2\mathbf{v} = \rho\,d\mathbf{v}/dt$,
differing from Eq. (5) by the term $k\nabla^2\mathbf{v}$.

122. Poiseuille's Law. — Suppose we have an incompressible
liquid flowing in a steady state in a horizontal cylinder of radius
$R$, parallel to the long axis of the cylinder ($x$ axis). We have
$v_y = v_z = 0$, and since there are no body forces $X = Y = Z = 0$.
The equation of continuity becomes $\partial v_x/\partial x = 0$, so
that $v_x$ is a function of $y$ and $z$ alone. Then
$dv_x/dt = v_x\,\partial v_x/\partial x + v_y\,\partial v_x/\partial y + v_z\,\partial v_x/\partial z = 0$.
Furthermore, if we take the divergence of the
fundamental equations of motion, we have:

$$
\rho\operatorname{div}\mathbf{F} - \operatorname{div}\operatorname{grad} p + k\nabla^2(\operatorname{div}\mathbf{v}) = \rho\frac{d}{dt}(\operatorname{div}\mathbf{v}).
$$

Now by the equation of continuity $\operatorname{div}\mathbf{v} = 0$,
and in our case of no external forces this reduces to

$$
\operatorname{div}\operatorname{grad} p = \nabla^2 p = 0.
$$

In our problem $\partial p/\partial y = \partial p/\partial z = 0$, so
that $\partial^2 p/\partial x^2 = 0$.
The pressure is thus a linear function of $x$, so that we have a
constant pressure gradient in the tube. Of the three equations,
only the first is left:

$$
\frac{\partial p}{\partial x} = k\left(\frac{\partial^2 v_x}{\partial y^2} + \frac{\partial^2 v_x}{\partial z^2}\right),
$$

and since $\partial p/\partial x$ is constant $= a$, and we have
cylindrical symmetry, this reduces to

$$
\frac{1}{r}\frac{d}{dr}\left(r\frac{dv_x}{dr}\right) = \frac{a}{k},
$$

where $r$ is the distance from the axis of the cylinder. Integrated,
this yields $v_x = \dfrac{a}{4k}r^2 + b\ln r + c$, and since $v_x$ is
finite for $r = 0$, $b = 0$. If the liquid clings to the walls of the
cylinder, $v_x = 0$ when $r = R$, so that we find

$$
v_x = \frac{a}{4k}\left(r^2 - R^2\right). \tag{8}
$$

Thus the liquid flows in cylindrical tubes of constant velocity.
This type of motion is called "laminar" motion. The velocity
varies parabolically across a diameter of the cylinder.

The amount of liquid flowing per second through a cylindrical
ring of thickness $dr$, radius $r$, is

$$
dQ = 2\pi r v_x\,dr,
$$

so that the total discharge rate of such a cylinder is

$$
Q = 2\pi\int_0^R r v_x\,dr = -\frac{\pi a R^4}{8k} = \frac{\pi R^4}{8kl}(p_1 - p_2), \tag{9}
$$

where we have placed the constant pressure gradient
$a = -(p_1 - p_2)/l$, $l$ being the length of tube over which the
pressure falls from $p_1$ to $p_2$. This law, known as Poiseuille's
law, furnishes a very nice experimental method of determining the
coefficient of viscosity of liquids.
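
Poiseuille's law can be checked by integrating the parabolic velocity profile of Eq. (8) numerically and comparing with the closed form of Eq. (9). In the sketch below the function names, the midpoint-rule integration, and the sample values of radius, viscosity, pressure drop and tube length are all assumptions made for illustration.

```python
import math


def velocity_profile(r, R, k, dp, length):
    """v_x(r) from Eq. (8), with pressure gradient a = -dp / length.

    dp is the pressure drop p1 - p2 over the given length of tube.
    """
    a = -dp / length
    return a / (4.0 * k) * (r * r - R * R)


def discharge_numeric(R, k, dp, length, n=1000):
    """Midpoint-rule estimate of Q = 2 pi * integral of r v_x dr from 0 to R."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += 2.0 * math.pi * r * velocity_profile(r, R, k, dp, length) * h
    return total


def discharge_poiseuille(R, k, dp, length):
    """Closed form of Eq. (9): Q = pi R^4 (p1 - p2) / (8 k l)."""
    return math.pi * R ** 4 * dp / (8.0 * k * length)


# Assumed sample values in consistent units.
R, k, dp, length = 0.01, 1.0e-3, 50.0, 2.0
q_numeric = discharge_numeric(R, k, dp, length)
q_formula = discharge_poiseuille(R, k, dp, length)
```

The two values agree to within the accuracy of the numerical integration. Because $Q$ varies as $R^4$, doubling the radius at fixed pressure gradient increases the discharge sixteenfold, which is why a measurement of $Q$ in a tube of known bore gives a sensitive determination of $k$.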


1. Liquid is confined between two parallel plates, so that it flows in two 
dimensions. At a certain point, a pipe discharges liquid at a constant rate 
into the region. Find the velocity potential, and velocity, as a function of 
position. Show by direct calculation that the flow outward over any 
circle about the source is the same. 

2. A shallow tray containing fluid has a source at one point, an equal 
sink at another, so that liquid flows in two dimensions from source to sink. 
Find the equation of the equipotentials and the lines of flow, prove they are 
circles and plot them. (Suggestion: since the equations are linear, the
potential or flux due to two sources is the sum of the solutions for the
separate sources.)

3. Prove that — (1/r) is a solution of Laplace's equation. Investigate 


the lines of flow connected with this as a potential. Draw the lines, in the 
xy plane. What sort of physical situation would be described by this case? 

4. Consider an ideal fluid at rest. It is subjected to an impulsive
pressure $(p) = \int_0^{\tau} p\,dt$, where $\tau$ indicates the
interval of time during which the pressure is applied. If no body
forces act on the fluid, prove by integrating Euler's equations that
the impulsive pressure divided by the density of the fluid
equals the velocity potential of the ensuing motion. This is the physical
significance of a velocity potential.


5. Show for a liquid in equilibrium under the action of gravity that the 
pressure varies linearly with the depth below the surface. Calculate the 
total force exerted on the surface of a submerged body by the liquid 
and show that the resultant force is directed upwards and is given in 
magnitude by Archimedes' principle. [Hint: If a vector has only one
component different from zero, e.g., $A_x$, then Gauss's theorem becomes
$\iiint \dfrac{\partial A_x}{\partial x}\,dV = \iint A_x\cos(n, x)\,dS$.]

6. The free surface of a liquid is one of constant pressure. If an incom- 
pressible fluid is placed in a cylindrical vessel and the whole rotated with 
constant angular velocity $\omega$, show that the free surface becomes a paraboloid
of revolution. (Hint: Introduce a fictitious potential energy to take care of 
centrifugal force and use the hydrostatic equations.) 

7. A gas maintained at constant pressure $p$ flows steadily out of a small
hole into the atmosphere, pressure $p_0$. Assume the density constant. Find
the expressions for the velocity of efflux and for the force exerted on the gas 
container due to the efflux. If the gas is oxygen at a pressure of 4 atmos- 
pheres in the tank, calculate the efflux velocity (1) with the density constant, 
and (2) taking into account the variation of density with pressure, assuming 
an adiabatic expansion. 

8. With the help of Gauss's theorem prove the theorem of the last chapter 
that the stress tensor is symmetric. 

9. Calculate the rate of discharge of a cylindrical pipe standing vertically, 
the liquid flowing in laminar flow under the action of gravity only. 

10. A perfect gas at constant temperature is in equilibrium under the 
action of gravity. Find the relation between the pressure of the gas and 
the height above the surface of the earth. 

11. Carry through the derivation of the laws of motion of viscous fluids 
using the modified form of Hooke's law and the general equations of motion 
of an elastic medium.


The problem of heat flow, although of quite different physical 
nature from elasticity and hydrodynamics, involves similar 
mathematics. Indeed, Fourier was concerned with problems 
of heat flow when he developed the series known by his name 
which we have used so much in our study of vibrations. First 
we set up the differential equation governing heat flow in a 
manner similar to the reasoning of the preceding chapters. 

123. Differential Equation of Heat Flow. — The fundamental 
physical fact is that when there is a difference of temperature in a 
material body, heat will flow, and the rate of flow is proportional 
to the temperature gradient. Suppose we have a slab of thickness 
$L$, area $a$, with a difference of temperature $T_1 - T_0$ between 
the faces. Then the amount of heat flowing per second across 
the face is $-ka(T_1 - T_0)/L$, where $k$ is the thermal conductivity, 
the negative sign meaning that if $T_1 > T_0$ the flow will be backward, 
toward low temperature. In the limit of an infinitely thin 
slab, this is simply $-ka\,\partial T/\partial x$, if $x$ is the coordinate measured in the 
direction of the heat flow. Next, there is the fact that if heat 
flows into a region, its temperature rises, the amount of rise being 
given by the relation that the amount of heat flowing in equals 
the change of temperature times the heat capacity, which in turn 
is the specific heat c times the mass. Putting these together, we 
obtain an equation which states the following: the rate of heat 
flow into a body is proportional to the time rate of change of its 
temperature; or, looking at it in another way, it is proportional 
to the temperature gradient around its boundaries. By eliminating 
the heat flow, we obtain a differential equation for the temperature. 

Our first principle, which we have stated in the form that 
$-ka\,\partial T/\partial x$ measures the heat flow across the area $a$ perpendicular 
to the $x$ axis, is evidently a special case of the general law that 
the flux density of heat flow is $f = -k\,\mathrm{grad}\,T$. This incidentally 
shows us at once that, if $k$ is a constant, $f$ is derivable from a 
potential, in this case $kT$, so that the curl of the flux is zero. The 
surfaces of constant temperature are called isothermals, and they 
serve as equipotentials, the lines of flow being at right angles to 
the isothermals. The equation of continuity now states that the 
time rate of increase of heat per unit volume equals the rate at 
which the heat flows in over the surface, plus the rate at which 
heat is produced inside. To raise the temperature of unit volume 
one degree requires an amount of heat equal to the heat capacity, 
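or $c\rho$, if $c$ is the specific heat, $\rho$ the density of matter. Thus the 
time rate of increase of heat is $c\rho$ times the time rate of increase 
of temperature. We have then 

$$\iiint c\rho\,\frac{\partial T}{\partial t}\,dv = -\iint f_n\,dS + \iiint P\,dv,$$

where $P$ means the rate of production of heat per unit volume. 
By Gauss's theorem, the second term becomes $-\iiint \mathrm{div}\,f\,dv$, so 
that for a small volume we have 

$$c\rho\,\frac{\partial T}{\partial t} = -\mathrm{div}\,f + P,$$

or

$$c\rho\,\frac{\partial T}{\partial t} = k\,\mathrm{div}\,\mathrm{grad}\,T + P = k\nabla^2 T + P. \qquad (1)$$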

This is the equation of heat flow. At a point where heat is not 
being produced, it reduces to 

$$\nabla^2 T = \frac{c\rho}{k}\,\frac{\partial T}{\partial t}, \qquad (2)$$

an equation similar to the wave equation as far as the dependence 
on space is concerned. It contains, however, a first rather than 
a second time derivative, and this results in solutions which are 
exponentially damped, like a particle with resistance but no 
restoring force, rather than oscillating solutions. The particular 
case where the temperature is independent of the time, the steady 
state, leads simply to Laplace's equation, the term in time 
dropping out. 
124. The Steady Flow of Heat. — The isothermals and lines of 
flow for the steady flow of heat are determined from Laplace's 
equation, and in some elementary cases we can find them with 
great ease. First let us consider a one-dimensional flow, which we 
obtain with a slab of a substance, like a window pane, assuming 
that the temperature varies only with the coordinate $x$ normal to 
the surface, being independent of $y$ and $z$. Laplace's equation 
becomes $d^2T/dx^2 = 0$, so that $T = a + bx$, with a constant 
temperature gradient. Thus if a face at $x = 0$ is kept at temperature 
$T_0$, the other face at $x = L$ at $T_1$, the temperature at 
intermediate points is given by $T = T_0 + (x/L)(T_1 - T_0)$. It 
is this simple case which furnishes the basis for the usual defini- 
tion of thermal conductivity. 

The cylinder forms a slightly more difficult problem in steady 
flow. For instance, let us ask for the steady state of temperature 
within a pipe formed of two concentric cylinders, whose inside 
and outside faces are kept at fixed temperatures. The tempera- 
ture will depend only on r, and will be determined, on account of 
the divergenceless nature of the flow, by the condition that the 
same amount of heat flows across the surface of any cylinder with 
radius intermediate between $r_0$ and $r_1$, the minimum and maximum 
radii of the pipe. This amount of heat is the product of 
the normal component of the flow, which is $f_r = -k(dT/dr)$, 
by the area of the cylinder, which for unit length along the pipe 
is $2\pi r$. In other words, $2\pi r f_r = -2\pi k r(dT/dr) = \text{constant}$, 
$dT/dr = a/r$, $T = a \ln r + b$. The two constants can be 
determined by fitting the temperatures at the two surfaces of the 
pipe. This example is interesting in showing that the tempera- 
ture gradient is not always a constant in the steady state. The 
reason is very simple: the tubes of flow are not of constant cross-
sectional area, and thus with a divergenceless flow the number of 
lines of flow per square centimeter, and consequently the magni- 
tude of the temperature gradient and flux vector, must change 
from point to point. The same thing is evident in the flow of 
heat in a sphere, where the flow through concentric spheres must 
be the same. Hence, since the areas of these spheres increase 
proportionally to the squares of the radii, the temperature gradi- 
ent must be inversely proportional to the square of the distance 
from the center, and the temperature inversely as the first power. 
These relations are just like those of the field and potential of a 
point charge in electrostatics, and as we shall later see, for just 
the same reason: both are solutions of Laplace's equation. 
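
The pipe and sphere results above can be tried numerically. The following is a minimal sketch, not part of the original text: the function names, the choice of fixed wall temperatures `T0` at `r0` and `T1` at `r1`, and the sample numbers in the usage note are all illustrative assumptions. It fits the two constants in $T = a \ln r + b$ (pipe) and in $T = a/r + b$ (spherical shell) to the wall temperatures.

```python
import math

def pipe_temperature(r, r0, r1, T0, T1):
    """Steady temperature T = a ln r + b in a pipe wall, with T0 at r0 and T1 at r1."""
    a = (T1 - T0) / math.log(r1 / r0)
    b = T0 - a * math.log(r0)
    return a * math.log(r) + b

def shell_temperature(r, r0, r1, T0, T1):
    """Steady temperature T = a / r + b in a spherical shell, fitted the same way."""
    a = (T1 - T0) / (1.0 / r1 - 1.0 / r0)
    b = T0 - a / r0
    return a / r + b

def pipe_heat_per_unit_length(r0, r1, T0, T1, k):
    """Heat crossing any coaxial cylinder per unit length and time, 2*pi*r*f_r.

    Since f_r = -k dT/dr = -k a / r, the product 2*pi*r*f_r does not depend on r.
    """
    a = (T1 - T0) / math.log(r1 / r0)
    return -2.0 * math.pi * k * a
```

For a pipe with the hot face inside, for example `pipe_temperature(1.5, 1.0, 2.0, 100.0, 20.0)`, the value lies below the straight-line interpolation between the two wall temperatures, which reflects the steeper gradient near the inner wall where the tubes of flow are narrower.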

125. Flow Vectors in Generalized Coordinates. — Complicated 
problems in the steady flow of heat, as in hydrodynamics and 
electrostatics, are best approached by introducing curvilinear 



coordinates, so that the boundaries of the bodies are expressed 
by coordinate surfaces, as with the cylinder and sphere. This 
suggests the formulation of the equation of steady flow, or 
Laplace's equation, in such general coordinates. Let the coordinates 
be $q_1$, $q_2$, $q_3$, and let them be orthogonal coordinates, so 
that the three sets of coordinate surfaces, $q_1 = \text{constant}$, $q_2 = \text{constant}$, 
$q_3 = \text{constant}$, intersect at right angles. Now let us 
move a distance $ds_1$ normal to a surface $q_1 = \text{constant}$. By doing 
so, $q_2$ and $q_3$ do not change, but we reach another surface on which 
$q_1$ has increased by $dq_1$, which in general is different from $ds_1$. 
Thus, with polar coordinates, if the displacement is along the 
radius, so that $r$ is changing, $ds = dr$; but if it is along a tangent 
to a circle, so that $\theta$ is changing, $ds = r\,d\theta$. In general, we have 

$$dq_1 = h_1\,ds_1, \qquad dq_2 = h_2\,ds_2, \qquad dq_3 = h_3\,ds_3, \qquad (3)$$

where in polar coordinates the $h$ connected with $r$ is unity, but 
that connected with $\theta$ is $1/r$. The first step in setting up vector 
operations in any set of coordinates is to derive these $h$'s, which 
can be done by elementary geometrical methods. 

126. Gradient in Generalized Coordinates. — The component 
of the gradient of a scalar S in any direction is its directional 
derivative in that direction. Thus the component in the direction 
1 (normal to the surface $q_1 = \text{constant}$) is $\partial S/\partial s_1 = h_1\,\partial S/\partial q_1$. 
For instance, in polar coordinates, the $r$ component is $\partial S/\partial r$, 
and the $\theta$ component is $(1/r)\,\partial S/\partial\theta$. 

Fig. 32. — Element of volume for vector operations in curvilinear coordinates. 

127. Divergence in Generalized 
Coordinates. — Let us apply 
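Gauss's theorem to a small volume element $dV = ds_1\,ds_2\,ds_3$, bounded 
by coordinate surfaces at $q_1$, $q_1 + dq_1$, etc., as in Fig. 32. If 
we have a vector $A$, of components $A_1$, $A_2$, $A_3$ along the three 
curvilinear axes, the flux into the volume over the face at $q_1$, 
whose area is $ds_2\,ds_3$, is $(A_1\,ds_2\,ds_3)_{q_1}$, and the corresponding flux 
out over the opposite face is $(A_1\,ds_2\,ds_3)_{q_1 + dq_1}$, where we note that 
the area $ds_2\,ds_3$ changes with $q_1$ as well as the flux density $A_1$. 
Thus the flux out over these two faces is 

$$\frac{\partial}{\partial q_1}(A_1\,ds_2\,ds_3)\,dq_1 = \frac{\partial}{\partial q_1}\left(\frac{A_1}{h_2 h_3}\right)dq_1\,dq_2\,dq_3 = h_1 h_2 h_3\,\frac{\partial}{\partial q_1}\left(\frac{A_1}{h_2 h_3}\right)dV.$$

Proceeding similarly with the other pairs of faces, and setting the whole outward flux 
equal to $\mathrm{div}\,A\,dV$, we have 

$$\mathrm{div}\,A = h_1 h_2 h_3\left[\frac{\partial}{\partial q_1}\left(\frac{A_1}{h_2 h_3}\right) + \frac{\partial}{\partial q_2}\left(\frac{A_2}{h_1 h_3}\right) + \frac{\partial}{\partial q_3}\left(\frac{A_3}{h_1 h_2}\right)\right]. \qquad (4)$$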

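128. Laplacian. — Writing the Laplacian as $\mathrm{div}\,\mathrm{grad}\,\phi$, and 
placing $A_1 = \mathrm{grad}_1\,\phi$, etc., in the expression for $\mathrm{div}\,A$, we have 

$$\nabla^2\phi = \mathrm{div}\,\mathrm{grad}\,\phi = h_1 h_2 h_3\left[\frac{\partial}{\partial q_1}\left(\frac{h_1}{h_2 h_3}\frac{\partial\phi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_2}{h_1 h_3}\frac{\partial\phi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_3}{h_1 h_2}\frac{\partial\phi}{\partial q_3}\right)\right]. \qquad (5)$$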

It can easily be verified that this formula leads to the same values 
for the Laplacian in special cases which we have already obtained 
by direct differentiation in Chap. XV. But now we can understand 
the formula better, for we see that the terms like $h_1/h_2h_3$ 
appearing inside the first differentiation arise from the fact that 
the flux through the opposite sides of a volume may differ not 
only on account of variation of the flux density, but also because 
the opposite sides can have different areas, as they do in the small 
volume element determined by coordinate surfaces with curvilinear 
coordinates. 
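
The general Laplacian of Eq. (5), $h_1h_2h_3$ times the sum of terms $\partial/\partial q_i\,[(h_i/h_jh_k)\,\partial\phi/\partial q_i]$, can be checked numerically for spherical polar coordinates. The sketch below is an illustration only: the function names, the central-difference step `eps`, and the test functions are assumptions, and the scale factors are those defined by $dq_i = h_i\,ds_i$, so that $h_r = 1$, $h_\theta = 1/r$, $h_\phi = 1/(r \sin\theta)$.

```python
import math

def spherical_scale_factors(q):
    """The h's defined by dq_i = h_i ds_i, for q = (r, theta, phi)."""
    r, theta, _ = q
    return (1.0, 1.0 / r, 1.0 / (r * math.sin(theta)))

def laplacian(phi, scale_factors, q, eps=1e-3):
    """Evaluate Eq. (5) at the point q, using central differences for both derivatives."""
    def weighted_derivative(i, point):
        # (h_i / (h_j h_k)) * dphi/dq_i; note h_i / (h_j h_k) = h_i**2 / (h_1 h_2 h_3).
        h = scale_factors(point)
        plus, minus = list(point), list(point)
        plus[i] += eps
        minus[i] -= eps
        dphi = (phi(plus) - phi(minus)) / (2.0 * eps)
        return h[i] ** 2 / (h[0] * h[1] * h[2]) * dphi

    total = 0.0
    for i in range(3):
        plus, minus = list(q), list(q)
        plus[i] += eps
        minus[i] -= eps
        total += (weighted_derivative(i, plus) - weighted_derivative(i, minus)) / (2.0 * eps)
    h = scale_factors(q)
    return h[0] * h[1] * h[2] * total

def r_squared(q):
    """phi = r^2 = x^2 + y^2 + z^2, whose Laplacian is 6 everywhere."""
    return q[0] ** 2

def z_coordinate(q):
    """phi = z = r cos(theta), a solution of Laplace's equation."""
    return q[0] * math.cos(q[1])
```

Calling `laplacian(r_squared, spherical_scale_factors, [1.3, 0.7, 0.4])` should give a value close to 6, and the same call with `z_coordinate` a value close to zero, up to the finite-difference error.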

129. Steady Flow of Heat in a Sphere. — Having obtained 
Laplace's equation in arbitrary coordinate systems, the problem 
of solving for the steady flow of heat becomes that of solving 
Laplace's equation in a suitable system, subject to certain boundary 
conditions. For instance, suppose we know that the surface 
of a sphere, radius $r_0$, is kept at a temperature independent of 
time, though depending on the angles $\theta$ and $\phi$. We then can 
set up the steady distribution of temperature within the sphere 
by solving Laplace's equation in spherical coordinates. The 
problem is mathematically like that of Problems 6, 7, and 8, 
Chap. XV, the vibration of a sphere, if we seek a solution independent 
of time. Just as in those problems, we separate variables 
in Laplace's equation, obtaining solutions of the form 
$\sin m\phi\,P_l^m(\cos\theta)\,R$, where the $P$'s are called associated Legendre 
polynomials, and where $R$ satisfies the equation 

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{l(l+1)}{r^2}R = 0,$$


which can be immediately solved by setting $R = r^n$, where $n$ is an 
integer to be determined. Substituting, this leads at once to the 
equation $n(n + 1) = l(l + 1)$, which has two solutions, $n = l$ 
or $n = -(l + 1)$. In the present case, where the function must 
stay finite within the sphere, at $r = 0$, we cannot have inverse 
powers, so that the only allowable functions are $r^l$. Other 
problems solved by the same method, however, as for instance 
those of the electrostatic fields of distributions of charges, often 
involve functions which may become infinite at $r = 0$ but remain 
finite at large $r$'s, and they must be expanded in the series of 
inverse powers. We now have for a general solution 

$$\sum_l \sum_m (A_{ml}\sin m\phi + B_{ml}\cos m\phi)\,P_l^m(\cos\theta)\,r^l.$$

To get the coefficients of the various terms in the sum, we set 
$r = r_0$, and determine the coefficients so that the resulting function 
of $\theta$ and $\phi$ is the assumed temperature distribution. This 
amounts to an expansion of the assumed function in series in the 
orthogonal functions $(\sin m\phi \text{ or } \cos m\phi)\,P_l^m(\cos\theta)$, and can be 
done by the usual methods for such expansions. 
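
As an illustration of this expansion, the sketch below treats only the special case in which the surface temperature does not depend on $\phi$, so that only the $m = 0$ terms (ordinary Legendre polynomials $P_l$) appear. This restriction, the function names, and the midpoint-rule quadrature are all assumptions made for the example. The coefficients follow from the orthogonality relation $\int_{-1}^{1} P_l P_{l'}\,d\mu = 2\delta_{ll'}/(2l + 1)$, with $\mu = \cos\theta$.

```python
import math

def legendre(l, mu):
    """P_l(mu) from the recurrence (n+1) P_{n+1} = (2n+1) mu P_n - n P_{n-1}."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p

def surface_coefficients(surface_temp, l_max, n_steps=4000):
    """c_l = (2l+1)/2 * integral of f(mu) P_l(mu) over -1 <= mu <= 1 (midpoint rule)."""
    d_mu = 2.0 / n_steps
    coeffs = []
    for l in range(l_max + 1):
        total = 0.0
        for i in range(n_steps):
            mu = -1.0 + (i + 0.5) * d_mu
            total += surface_temp(mu) * legendre(l, mu)
        coeffs.append(0.5 * (2 * l + 1) * total * d_mu)
    return coeffs

def interior_temperature(coeffs, r, theta, r0):
    """Sum of c_l (r / r0)^l P_l(cos theta): finite at r = 0, equal to f on r = r0."""
    mu = math.cos(theta)
    return sum(c * (r / r0) ** l * legendre(l, mu) for l, c in enumerate(coeffs))

def sample_surface(mu):
    """Hypothetical surface temperature 2 cos(theta) + 3 cos^2(theta) - 1."""
    return 2.0 * mu + 3.0 * mu ** 2 - 1.0
```

For `sample_surface`, which equals $2P_1 + 2P_2$, the computed coefficients should be close to 0, 2, 2, 0 for $l$ = 0 to 3, and the temperature at the center reduces to the $l = 0$ coefficient, the mean of the surface temperature over the sphere.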

130. Spherical Harmonics. — To understand the physical 
meaning of the various terms of the expansion, we should con- 
sider the spherical harmonics, or functions of angles. Solving 
for these as in the problems quoted above, we find for the first 
few functions the following values: 

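$l = 0$, $m = 0$: constant 

$l = 1$, $m = \pm 1$: $(\sin\phi \text{ or } \cos\phi)\,\sin\theta$ 

$m = 0$: $\cos\theta$ 

$l = 2$, $m = \pm 2$: $(\sin 2\phi \text{ or } \cos 2\phi)\,\sin^2\theta$ 

$m = \pm 1$: $(\sin\phi \text{ or } \cos\phi)\,\sin\theta\cos\theta$ 

$m = 0$: $3\cos^2\theta - 1$. 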

These functions are shown graphically in Fig. 33, where the 
intersections of the nodal planes or cones with unit sphere are 
drawn. Thus the functions with $l = 1$ have one nodal plane, 
which may be perpendicular to any one of the three coordinate 
axes. This is seen most easily by remembering that $x = r\sin\theta\cos\phi$, 
$y = r\sin\theta\sin\phi$, $z = r\cos\theta$, so that the three 
solutions of the problem corresponding to $l = 1$ ($r$ times the 
functions of angle) are simply $x$, $y$, $z$. These are obviously 
solutions of Laplace's equation, and have the nodal planes 
$x = 0$, $y = 0$, $z = 0$, respectively. Similarly by making linear 
combinations of these three functions, we obtain solutions having 
any desired nodal plane. This is analogous to the degeneracy 
in the circular membrane, discussed in Sec. 103. With $l = 2$, 
there are two nodal surfaces, and so on. For discussing the 
vibrations of a sphere, of course these nodes would represent 
the regions of no displacement, the material on one side being 
displaced one way, the material on the other side in the opposite 
direction. With heat flow, the separate terms represent simple 
types of steady temperature distribution. For instance, the 
terms with $l = 1$ represent spheres in which the surface temperature 
varies as the cosine of the colatitude angle, or as the distance 
in a direction along the axis, and our solution tells us that in 
this case the temperature within the body varies linearly with 
distance, as in a flat slab. Higher terms represent more complicated 
solutions, and by superposing them any desired steady 
heat flow can be built up. 

Fig. 33. — Spherical harmonics. Figures represent nodal lines on the surface of 
a sphere, for the functions $\sin m\phi\,P_l^m(\cos\theta)$ and $\cos m\phi\,P_l^m(\cos\theta)$. Upper line, 
$l = 1$; lower line, $l = 2$. 

131. Fourier's Method for the Transient Flow of Heat. — The 
simplest type of problem in the transient flow of heat is the 
following: At t = 0, a body has a temperature which is an 
arbitrary function of position. At that instant, it is plunged 
into a cooling bath of some sort, which instantly cools its sur- 
faces to a fixed distribution of surface temperature which is 


maintained after that. The problem is to find the temperature 
throughout the body as a function of time as it cools from its 
initial to its final steady state. This can be easily reduced to a 
simpler case. We write the temperature at any time as the sum 
of two terms, the transient solution, and the steady-state solu- 
tion. The latter is the temperature distribution set up by the 
cooling baths around the surface, and is discussed as in the last 
few sections in which steady flow of heat has been considered. 
The transient solution starts off with a temperature distribution 
which, added to the steady-state solution, gives the assumed 
initial temperature distribution of the body, and then gradually 
damps down to zero, finally leaving the steady-state solution 
only. Since at any instant after $t = 0$ the steady-state solution 
by itself gives the correct boundary temperature about the sur- 
face of the body, we see that the transient must give zero tempera- 
ture at all points of the surface, independent of time. Thus 
the transient by itself is the solution of the problem in which a 
body is heated to an arbitrary temperature distribution at t = 0, 
after that is plunged into a cooling bath maintaining its whole 
surface at temperature zero, and gradually cools down to this 
temperature. We investigate this transient problem. 

First we take the one-dimensional case, again of a slab, in 
which the initial temperature is an arbitrary function of $x$, but 
at all times after $t = 0$ the two faces, at $x = 0$ and $x = L$, are 
maintained at $T = 0$. The heat-flow equation becomes 

$$\frac{\partial^2 T}{\partial x^2} = \frac{c\rho}{k}\frac{\partial T}{\partial t} = A\frac{\partial T}{\partial t}, \qquad A = \frac{c\rho}{k}.$$

We solve this equation by separation of variables. If $T = X(x)\Theta(t)$, 
and if we substitute in the equation and divide by $T$, 
we have 

$$\frac{1}{X}\frac{d^2X}{dx^2} = \frac{A}{\Theta}\frac{d\Theta}{dt} = -C^2.$$

Then separating we have 

$$\frac{d\Theta}{dt} + \frac{C^2}{A}\Theta = 0, \qquad \frac{d^2X}{dx^2} + C^2X = 0.$$

The solutions are 

$$\Theta = e^{-(C^2/A)t}, \qquad X = \sin Cx \text{ or } \cos Cx.$$

We see that the temperature decreases exponentially with the 
time, approaching a constant value, a very reasonable behavior. 


The boundary condition is now $T = 0$ when $x = 0$, $x = L$, 
and we satisfy this as we would with the vibrating string: we 
take only sines, and only those which reduce to zero at $x = L$; 
that is, we take $\sin(n\pi x/L)$, where $n$ is an integer. In other 
words, $C = n\pi/L$, so that the function is constant $\times\ e^{-(n^2\pi^2/AL^2)t}\sin(n\pi x/L)$, 
and the whole solution, writing in the value of $A$, is 

$$T = \sum_n K_n\,e^{-(n^2\pi^2 k/c\rho L^2)t}\sin\frac{n\pi x}{L}. \qquad (6)$$

Let us assume that the temperature distribution at $t = 0$ 
is $T = f(x)$. Then we wish to find the coefficients $K_n$, determining 
the temperature at later times. At $t = 0$ the exponentials 
go to 1, so that we have $f(x) = \sum_n K_n \sin(n\pi x/L)$. We can then find 
the coefficients $K_n$ by Fourier's method, so that the problem is 
solved. The qualitative nature of the solution is easy to see. 
The original shape of the temperature curve will be distorted 
as time goes on, since the terms with high n damp down more 
rapidly than the others. After a certain lapse of time the whole 
slab will have become cooler, but also with a more simple tem- 
perature distribution, approximating the single term with n = 1. 
Thus, for instance, if it is originally all at a constant high tem- 
perature, and then is cooled, the original temperature curve 
would rise discontinuously from 0 at the edge to a constant 
value T inside. But after a time the curve would be like a single 
loop of a sine curve, showing that the edges would cool more 
rapidly than the middle. 
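
The cooling of a uniformly heated slab described above can be sketched numerically. The example is an illustration under stated assumptions: the initial temperature is taken as a constant `T0` (a name introduced here), for which Fourier's method gives $K_n = 4T_0/n\pi$ for odd $n$ and zero for even $n$, and the series is simply truncated after `n_terms` odd terms.

```python
import math

def slab_temperature(x, t, L, T0, k, c, rho, n_terms=200):
    """Series solution for a slab initially at uniform T0, faces held at 0 for t > 0."""
    A = c * rho / k
    total = 0.0
    for n in range(1, 2 * n_terms, 2):  # only odd n contribute for a uniform start
        K_n = 4.0 * T0 / (n * math.pi)
        decay = math.exp(-((n * math.pi / L) ** 2) * t / A)
        total += K_n * decay * math.sin(n * math.pi * x / L)
    return total
```

With `L = k = c = rho = T0 = 1`, the value at the middle of the slab is close to 1 at `t = 0`, while by `t = 0.2` it is indistinguishable from the single $n = 1$ term, $(4/\pi)\,e^{-\pi^2 t}$, in line with the remark that the high-$n$ terms damp out first.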

The transient flow of heat in bodies of other shape may be 
considered by extensions of the same method. Thus the transient 
flow in the cylinder or sphere can be handled by introducing 
cylindrical or spherical polar coordinates, and separating vari- 
ables just as for the vibration problems. The solutions, as far 
as the coordinates are concerned, come out as with vibrations, 
leading, for example, to sines and cosines of the angle, and Bessel's 
functions of r, in the case of two-dimensional flow in a circle or 
cylinder, but the time enters as a real exponential damping down 
to zero, rather than a complex exponential or sinusoidal function. 
Special cases are discussed in the problems. 

132. Integral Method for Heat Flow.— There is another, differ- 
ent, method of great use in discussing the transient flow of heat. 



This method is based on an important particular solution of the 
heat-flow equation. If we consider again the one-dimensional 
flow, and let $a^2 = k/c\rho$, we can easily show that the function 

$$f(x - x', t) = \frac{1}{\sqrt{4\pi a^2 t}}\,e^{-(x - x')^2/4a^2t} \qquad (7)$$

is a solution of the equation, where $x'$ is an arbitrary constant. 
To prove this, it is only necessary to substitute in the differential 
equation. The graph of the function $f$, plotted against $x$ for 
different values of $t$, as in Fig. 34, has a sharp maximum at $x = x'$, 
looking like the familiar Gauss curve for probability distributions. 
At $t = 0$ the curve is coincident with the $x$ axis everywhere 
except at $x = x'$, where it forms an infinitely high and narrow 
mountain, so that the area under the curve is finite. As time 
goes on, this mountain becomes flatter and broader, until finally 
the function is zero everywhere. 

Fig. 34. — Function $f(x - x', t)$ of Eq. (7), as function of $x$, for different $t$'s. 
The function represents temperature distribution at different times resulting from 
initial conditions where the temperature is infinite at $x'$, zero elsewhere. 
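
Both properties claimed for this function, that it satisfies the heat-flow equation $\partial f/\partial t = a^2\,\partial^2 f/\partial x^2$ and that the area under it stays finite (in fact equal to unity), can be checked numerically. The sketch assumes the Gaussian form $f = (4\pi a^2 t)^{-1/2}\,e^{-(x - x')^2/4a^2t}$; the function names, difference step, and integration range are illustrative choices.

```python
import math

def kernel(x, x_prime, t, a):
    """The particular solution f(x - x', t), assumed to have the Gaussian form above."""
    return math.exp(-((x - x_prime) ** 2) / (4 * a * a * t)) / math.sqrt(4 * math.pi * a * a * t)

def heat_equation_residual(x, x_prime, t, a, eps=1e-4):
    """df/dt - a^2 d2f/dx2 by central differences; it should be close to zero."""
    df_dt = (kernel(x, x_prime, t + eps, a) - kernel(x, x_prime, t - eps, a)) / (2 * eps)
    d2f_dx2 = (kernel(x + eps, x_prime, t, a) - 2 * kernel(x, x_prime, t, a)
               + kernel(x - eps, x_prime, t, a)) / eps ** 2
    return df_dt - a * a * d2f_dx2

def area_under_kernel(t, a, half_width=20.0, n_steps=20000):
    """Midpoint-rule integral of f over -half_width <= x <= half_width, with x' = 0."""
    dx = 2.0 * half_width / n_steps
    return sum(kernel(-half_width + (i + 0.5) * dx, 0.0, t, a) for i in range(n_steps)) * dx
```

The area comes out as unity both for a broad curve (`t = 0.5`) and for a narrow one (`t = 0.01`), which is the numerical counterpart of the mountain becoming flatter and broader without changing its area.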

The function $f$ can be used to discuss the following problem: 
At $t = 0$ the temperature throughout an infinite body is given 
by a function $T_0(x)$, and we are interested in the way in which this 
temperature distribution changes with time. We can break up 
the problem into a sum of other simpler problems, by dividing up 
the distance $x$ into small intervals, by a succession of points $x_1$, 
$x_2$, $\ldots$, $x_n$. We set up the following problems: 

1. The initial temperature is $T_0(x_0)$ between $x_0$ and $x_1$, but 
is zero elsewhere; 

2. The initial temperature is $T_0(x_1)$ between $x_1$ and $x_2$, but 
is zero elsewhere; 

n. The initial temperature is $T_0(x_{n-1})$ between $x_{n-1}$ and $x_n$, 
but is zero elsewhere. 

The initial temperature distribution connected with one of 
these problems would be similar to the curve of Fig. 34, for very 
small value of t, in that it would be large in a very small region, 
negligible or zero elsewhere. To make the maximum come at the 
right place, we must choose $x'$ for the $i$th problem equal to $x_i$. 
As time goes on, the function $f$ gives a good approximation to the 
way in which the temperature in this simple problem changes. 
Now if, at $t = 0$, we add together all the temperatures of Probs. 1 
to $n$, we get the correct initial distribution of temperature. 
Therefore, if we add all the solutions at a later time, we again 
get the solution for the whole problem. This, of course, actually 
becomes an integral, the element of the integrand connected with 
the interval $dx_i$, which equals $x_{i+1} - x_i$, being proportional to 
$T_0(x_i)\,f(x - x_i, t)\,dx_i$. As a matter of fact, the constant of 
proportionality in $f$ is so chosen that this gives just the right 
value, so that 

$$T(x, t) = \int_{-\infty}^{\infty} T_0(x')\,f(x - x', t)\,dx'. \qquad (8)$$

To prove this, we need to do two things: first, prove that it is a 
solution of the heat-flow equation; secondly, show that it 
approaches the correct value at t = 0. The first is obvious, for 
the integrand, regarded as a function of x and t, has already been 
shown to be a solution of the equation, and on account of the linear 
nature of the differential equation a sum of solutions is a solution. 
For the second, we note that at $t = 0$ the function $f(x - x', t)$ has 
appreciable values only at $x = x'$. The whole integral will then 
come from the immediate neighborhood of $x' = x$, so that we 
may insert this value in $T_0$, and take it outside the integral sign, obtaining 

$$T(x, 0) = T_0(x)\int_{-\infty}^{\infty} f(x - x', 0)\,dx'.$$

The integral is $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-u^2}\,du$, where $u = (x - x')/2a\sqrt{t}$, and this equals 
unity. Hence we have shown that $T(x, 0) = T_0(x)$, so that we 
have verified our solution. 
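
The superposition integral of Eq. (8) is easy to evaluate numerically. The sketch below is an illustration only: it again assumes the Gaussian form of $f$, and picks as a hypothetical initial distribution a temperature of unity for $|x| < 1$ and zero elsewhere. For that choice the integral can also be written with the error function, $T = \frac{1}{2}[\mathrm{erf}((x + 1)/2a\sqrt{t}) - \mathrm{erf}((x - 1)/2a\sqrt{t})]$, which gives an independent check. All function names are assumptions.

```python
import math

def kernel(x, x_prime, t, a):
    """f(x - x', t), assumed Gaussian: exp(-(x-x')^2 / 4a^2t) / sqrt(4 pi a^2 t)."""
    return math.exp(-((x - x_prime) ** 2) / (4 * a * a * t)) / math.sqrt(4 * math.pi * a * a * t)

def temperature(initial, x, t, a, lo, hi, n_steps=4000):
    """Eq. (8) by the midpoint rule over lo <= x' <= hi, outside which `initial` must vanish."""
    dx = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        xp = lo + (i + 0.5) * dx
        total += initial(xp) * kernel(x, xp, t, a)
    return total * dx

def box(xp):
    """Hypothetical initial distribution: unity between x' = -1 and x' = 1."""
    return 1.0 if abs(xp) < 1.0 else 0.0

def box_exact(x, t, a):
    """Closed form of the same integral in terms of the error function."""
    s = 2.0 * a * math.sqrt(t)
    return 0.5 * (math.erf((x + 1.0) / s) - math.erf((x - 1.0) / s))
```

At a very small time such as `t = 1e-4`, `temperature(box, 0.0, 1e-4, 1.0, -1.0, 1.0)` returns a value close to the initial temperature of unity, which illustrates the second step of the proof above.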

By a slight variation, it is possible to solve the problem in which 
the temperature of a semi-infinite slab bounded by $x = 0$ is 
initially any desired value, and in which the surface is kept at 
$T = 0$ at all subsequent times. Let the initial temperature be 
$T_0(x)$, where this function is defined only for positive $x$'s, inside 
the slab. We now define an odd function equal to $T_0(x)$ for 
positive $x$'s, equal therefore to $-T_0(-x)$ for negative $x$'s. If 
we set up an infinite slab with this temperature distribution, 
then on account of symmetry the temperature at $x = 0$ will 
always be zero, and our boundary condition is satisfied, the part 
of the solution for positive x's being the desired function. 
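
The odd-extension device can be sketched in the same way. Integrating Eq. (8) over the odd function is equivalent to integrating $T_0(x')\,[f(x - x', t) - f(x + x', t)]$ over positive $x'$ only. The example assumes the Gaussian form of $f$ and, purely for illustration, a slab initially at a uniform temperature of unity, for which the result is known to reduce to $\mathrm{erf}(x/2a\sqrt{t})$. The function names and the finite integration depth are assumptions.

```python
import math

def kernel(x, x_prime, t, a):
    """f(x - x', t), assumed Gaussian: exp(-(x-x')^2 / 4a^2t) / sqrt(4 pi a^2 t)."""
    return math.exp(-((x - x_prime) ** 2) / (4 * a * a * t)) / math.sqrt(4 * math.pi * a * a * t)

def semi_infinite_slab(initial, x, t, a, depth=40.0, n_steps=40000):
    """Temperature at x >= 0 when the surface x = 0 is held at zero.

    The source at x' and its negative image at -x' are combined, so only
    0 < x' < depth is integrated (midpoint rule); depth stands in for infinity.
    """
    dx = depth / n_steps
    total = 0.0
    for i in range(n_steps):
        xp = (i + 0.5) * dx
        total += initial(xp) * (kernel(x, xp, t, a) - kernel(x, -xp, t, a))
    return total * dx

def uniform(xp):
    """Hypothetical initial temperature: unity throughout the slab."""
    return 1.0
```

At the surface, `semi_infinite_slab(uniform, 0.0, 0.25, 1.0)` vanishes identically because each source is cancelled by its image, which is the symmetry argument of the text in numerical form.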

Integral methods similar to that described can be used also 
to discuss the problem in which the surface of a semi-infinite 
slab is kept at a temperature which varies in an arbitrary way 
with time. Two- and three-dimensional problems can also be 
treated, though the principles are not essentially different from 
those already considered. 

One interesting feature of heat flow is brought out by the 
integral solution which we have just used. That is its irreversi- 
ble nature. Thermodynamically, heat conduction is a typical 
irreversible process, and this is shown in the fact that heat 
always flows from the warmer to the cooler body, never in the 
opposite direction. With reversible processes, as for instance 
vibration problems, one can change the sign of the time where it 
appears in the solution and still have a possible solution of the 
equation; a vibration running backward is not essentially 
different from one running forward. But that is not the case 
in the heat-flow equation, as we see easily from Eq. (7), where, 
if we attempt to give t a negative value, the solution becomes 
imaginary. The essential mathematical difference between the 
two cases is that in heat flow a first time derivative appears, 
while in vibration problems and wave equations there is a second 
time derivative. This second time derivative is unchanged 
when t is changed to —t, whereas the first time derivative in 
the heat-flow equation changes sign with t, so that, if a given 
function satisfies the equation, it will no longer satisfy it if time 
is reversed. 



1. Derive the divergence, gradient, and Laplacian in spherical polar 
coordinates by the general method of this chapter. 

2. Discuss the steady flow of heat in a spherical shell contained between 
two concentric spheres, the temperature being an arbitrary function of 
position over both surfaces. 

3. Discuss the steady two-dimensional flow of heat in a semi-infinite 
rectangular bar bounded by x = 0, x = L, y = 0, extending to infinity 
along the y axis, subject to the boundary condition that the temperature 
is zero along the two infinite sides of the bar, but that it is an arbitrary 
function of $x$ along the end from $x = 0$ to $x = L$. Build up the solution 
out of individual solutions varying sinusoidally with x, and exponentially 
with y, noting that they must decrease rather than increase exponentially 
as y increases. 

4. Discuss the steady flow of heat in a semi-infinite cylindrical rod with a 
flat end, if the temperature is kept at zero along the cylindrical face, but is 
an arbitrary function of position on the end. 

5. A slab is heated to a uniform temperature $T_1$, then plunged in a bath 
which keeps its temperature at $T_0$. Find the interior temperature as a 
function of the time, computing and drawing several graphs, so chosen as 
to show the progress of the cooling process. 

6. For small times after the cooling process has commenced in Prob. 5, 
the interior temperature will not have changed appreciably, and the slab 
will act practically like a semi-infinite slab. Compare the solution of 
Prob. 5, using Fourier's method, with the corresponding solution by the 
integral method, computing both curves and comparing. 

7. In an infinite body the temperature is initially unity between the 
planes $x = -1$ and $x = 1$, and is zero everywhere else. Plot the temperature 
as a function of $x$ for several instants of time, and finally for $t = \infty$. 

e-» 2 dtt.) 

8. Prove that the integral $\int_0^\infty e^{-u^2}\,du = \sqrt{\pi}/2$. (Suggestion: Multiply this 
integral by the equal integral $\int_0^\infty e^{-v^2}\,dv$, and consider $u$ and $v$ as Cartesian 

coordinates in a plane. Introduce polar coordinates in the plane, carrying 
out the integration in those coordinates.) 

9. Show that a particular integral of the equation for heat flow in an 
infinite medium is constant $\times\ t^{-3/2}\,e^{-r^2/4a^2t}$, where $r$ is the distance from the origin. 

Discuss the initial temperature distribution corresponding to this solution. 

10. Show that the integral 

$$T = \frac{1}{(4\pi a^2 t)^{3/2}} \iiint T_0(x', y', z')\,e^{-r^2/4a^2t}\,dx'\,dy'\,dz'$$

is a general solution of the heat-flow equation in three dimensions corresponding 
to an initial temperature distribution of $T_0(x, y, z)$, where 
$r^2 = (x - x')^2 + (y - y')^2 + (z - z')^2$. 



The problems of electrostatics are practically identical mathe- 
matically with those of flow, which we have been considering 
in the last few chapters. The fundamental physical law is 
very simple. Electric charges exert forces on each other, given 
by Coulomb's law, which states that the force is directed along 
the line of centers, and equal to $ee'/r^2$, where $e$ and $e'$ are the 
strengths of the charges, r the distance between. The force 
on a particular charge is then given as the sum of the individual 
attractions and repulsions exerted by all the other charges. 
The force per unit charge at any point is the intensity of the 
electric field, a vector function of position. The lines tangent 
to the force vector, similar to the lines of flow in the last two 
chapters, are called the lines of force. 

133. The Divergence of the Field.— Consider the field of a 
point charge at the origin of coordinates. The field intensity 
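$E$ is a vector of magnitude $e/r^2$, pointing out along the radius; 
its components are thus 

$$E_x = \frac{ex}{r^3}, \qquad E_y = \frac{ey}{r^3}, \qquad E_z = \frac{ez}{r^3}.$$

We then have 

$$\mathrm{div}\,E = \frac{\partial}{\partial x}\left(\frac{ex}{r^3}\right) + \frac{\partial}{\partial y}\left(\frac{ey}{r^3}\right) + \frac{\partial}{\partial z}\left(\frac{ez}{r^3}\right) = e\left(\frac{3}{r^3} - \frac{3(x^2 + y^2 + z^2)}{r^5}\right) = 0.$$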

We thus see that the field of a point charge is divergenceless. 
In other words, if we represent the field strength by the number 
of lines of force per square centimeter, these lines will never start 
or stop in empty space. They will, of course, start or stop on 
charges. We cannot see this directly, but we can prove it by 
using Gauss's theorem. Take a small sphere of radius R about 
the origin. Then we know that the volume integral of the 
divergence of E over the volume equals the surface integral 



of the normal component of $E$. This component is $e/R^2$, and 
the surface area is $4\pi R^2$, so that the surface integral in question 
is $4\pi e$. Thus the volume integral of the divergence over our 
small volume is $4\pi e$, which is different from zero. Since the 
number of lines emerging across an area equals the field strength, 
the total number of lines of force diverging from the charge $e$ 
is also $4\pi e$. 
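
Both statements, that the divergence vanishes away from the charge and that the flux through a sphere about the charge is $4\pi e$ whatever its radius, can be checked numerically. The following sketch uses the components $ex/r^3$, $ey/r^3$, $ez/r^3$; the function names, step sizes, and grid resolution are illustrative assumptions.

```python
import math

def field(x, y, z, e=1.0):
    """Components (ex/r^3, ey/r^3, ez/r^3) of the field of a point charge e at the origin."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (e * x / r3, e * y / r3, e * z / r3)

def divergence(x, y, z, e=1.0, eps=1e-4):
    """div E by central differences at a point away from the origin."""
    dEx = (field(x + eps, y, z, e)[0] - field(x - eps, y, z, e)[0]) / (2 * eps)
    dEy = (field(x, y + eps, z, e)[1] - field(x, y - eps, z, e)[1]) / (2 * eps)
    dEz = (field(x, y, z + eps, e)[2] - field(x, y, z - eps, e)[2]) / (2 * eps)
    return dEx + dEy + dEz

def flux_through_sphere(R, e=1.0, n_theta=200, n_phi=100):
    """Surface integral of the normal component of E over a sphere of radius R."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            E = field(R * n[0], R * n[1], R * n[2], e)
            E_n = E[0] * n[0] + E[1] * n[1] + E[2] * n[2]
            total += E_n * R * R * math.sin(theta) * d_theta * d_phi
    return total
```

The flux comes out close to $4\pi e$ for a small sphere and a large one alike, since the $1/R^2$ fall of the field is exactly offset by the $R^2$ growth of the area.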

Now consider the field of many point charges. The field of 
each charge separately has zero divergence. Therefore, since 
the divergence of the sum of several functions is the sum of the 
divergences, it is plain that the divergence of the whole field 
vanishes: $\mathrm{div}\,E = 0$ in general. The only exception is for those 
points where there is charge, for there we have seen that the 
divergence does not vanish. Let us see what does happen there. 
In the first place we introduce $\rho$, the volume density of charge. 
Now take a small volume $dv$, containing a charge $\rho\,dv$. Surely 
if $dv$ is small enough this field will be just as if the same charges 
were concentrated at a point. Thus $4\pi\rho\,dv$ lines will diverge 
from the charge, or $\iint E_n\,dS = \mathrm{div}\,E\,dv = 4\pi\rho\,dv$. Dividing by 
$dv$, we have 

$$\mathrm{div}\,E = 4\pi\rho. \qquad (1)$$

This is the general equation for the divergence of the field, and 
we see that it reduces to $\mathrm{div}\,E = 0$ at points where the charge 
density vanishes. This equation, $\mathrm{div}\,E = 4\pi\rho$, is mathematically 
equivalent to the continuity equation 

$$\frac{\partial\rho}{\partial t} = -\mathrm{div}\,f + P,$$

if we set the time derivative equal to zero, and consider $4\pi\rho$ as 
the quantity analogous to the rate of production of material. 
Here, of course, there is no actual idea of flow, the analogy being 
merely mathematical. 

134. The Potential. — We can immediately show that the curl 
of the field of a point charge vanishes. And unlike the divergence 
equation, this is true everywhere, even right at the charge. Then, 
if we superpose many charges, the curl still is zero, so that we 
have the general equation $\mathrm{curl}\,E = 0$. This holds in all static 
cases (we shall later have a term to add to the equation, containing 
a time derivative). Thus we can always set up an electrostatic 
potential $\phi$, such that $E = -\mathrm{grad}\,\phi$. Taking the divergence, 
we find the equation which the potential satisfies: it is 

$$-\mathrm{div}\,\mathrm{grad}\,\phi = -\nabla^2\phi = 4\pi\rho, \qquad (2)$$

which is called Poisson's equation. Laplace's equation $\nabla^2\phi = 0$ 
is the special case which holds in those regions of space that 
contain no charge. 

If we form the line integral of the electric field intensity along 
a given curve between two points of the field, A and B, then 
$\int E \cdot ds$ along this curve is called the electromotive force along the 
path. It is obviously the work per unit charge done by the field 
when a charge is moved along the given path from $A$ to $B$. 
In the electrostatic case, since $E$ can be obtained from a potential, 
$E = -\mathrm{grad}\,\phi$ and 

$$\text{E.m.f.} = \int_A^B E \cdot ds = -\int_A^B \mathrm{grad}\,\phi \cdot ds = -\int_A^B \left(\frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz\right) = \phi_A - \phi_B,$$

so that in this case the e.m.f. is equal to the potential difference 
between the points $A$ and $B$. The distinction between e.m.f. 
and potential difference is of importance in cases where $\mathrm{curl}\,E \neq 0$ 
and hence there is no potential. Even in this case we may still 
use the idea of e.m.f. 

135. Electrostatic Problems without Conductors. — There are 
two principal sorts of electrostatic problems. The first is that 
in which we know the distribution of charge, and wish to compute 
the field. We could always do this by direct summation of the 
fields due to the individual charges, but often that is very difficult, 
and we can simplify greatly by using the potential and Laplace's 
equation. Thus suppose we have charge uniformly distributed 
over an infinite plane, the amount per unit area being $\sigma$, and 
suppose we wish the field at a distance $R$ from that plane. We 
may get this by a direct calculation. Thus we take a set of polar 
coordinates in the plane, which center at the point directly 
beneath the place where we wish the potential, as in Fig. 35. 
Between the circles of radius $r$ and $r + dr$, and between $\theta$ and 
$\theta + d\theta$, will be an amount of charge $\sigma r\,d\theta\,dr$. This will be at a 
distance $\sqrt{R^2 + r^2}$ from the point we are interested in, so that its 
field will have the magnitude $\sigma r\,d\theta\,dr/(R^2 + r^2)$. The component normal 
to the plane, which is all that we need, is this times $R/\sqrt{R^2 + r^2}$. 
The total field is then 

$$E_n = \int_0^{2\pi} d\theta \int_0^\infty \frac{\sigma R\,r\,dr}{(R^2 + r^2)^{3/2}} = 2\pi\sigma \int_0^\infty \frac{x\,dx}{(1 + x^2)^{3/2}},$$

where $x = r/R$. Letting $1 + x^2 = y$, so that $x\,dx = dy/2$, this is 

$$2\pi\sigma \int_1^\infty \frac{dy}{2y^{3/2}} = 2\pi\sigma \left[-y^{-1/2}\right]_1^\infty = 2\pi\sigma.$$

Thus the field is a constant, independent of position. Similarly 
on the other side of the plane it is $-2\pi\sigma$, so that there is a 
discontinuity in E of $4\pi\sigma$ in crossing the surface. 

Fig. 35. Field of a charged plane. From charge between $r$ and $r + dr$, 
$\theta$ and $\theta + d\theta$, the normal component of the field is 
$\sigma R\,r\,dr\,d\theta/(R^2 + r^2)^{3/2}$. 
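
The direct summation above can be checked numerically. The following is a minimal sketch, assuming Gaussian (e.s.u.) units and a disk of finite radius standing in for the infinite plane; the function names and the cutoff `r_max` are illustrative choices, not part of the text. It sums the ring contributions and compares them with the closed form for a finite disk, which tends to $2\pi\sigma$ as the disk grows.

```python
import math

def disk_field_numeric(sigma, R, r_max, steps=200_000):
    """Sum the normal field of rings of radius r, 0 < r < r_max.

    A ring of charge 2*pi*sigma*r*dr at distance sqrt(R^2 + r^2) contributes
    2*pi*sigma*R*r*dr / (R^2 + r^2)^(3/2) to the normal component.
    """
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr  # midpoint of the k-th ring
        total += 2.0 * math.pi * sigma * R * r * dr / (R * R + r * r) ** 1.5
    return total

def disk_field_exact(sigma, R, r_max):
    """Closed form of the same integral for a disk of radius r_max."""
    return 2.0 * math.pi * sigma * (1.0 - R / math.sqrt(R * R + r_max * r_max))

if __name__ == "__main__":
    sigma = 1.0
    for R in (0.5, 1.0, 4.0):
        numeric = disk_field_numeric(sigma, R, r_max=100.0 * R)
        # The ratio to 2*pi*sigma is the same for every R and tends to 1
        # as r_max / R grows.
        print(R, numeric / (2.0 * math.pi * sigma))
```

With the cutoff held at a fixed multiple of R the ratio does not depend on R, which is the numerical counterpart of the statement that the field of the infinite plane is independent of position.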

We have seen that it is possible in such a simple case to compute 
the field directly. But it is done far more easily by using our 
general principles. Thus the potential can depend only on the 
coordinate normal to the plane, which we denote by x. Its 
differential equation, outside the charged sheet, is then 
$d^2\phi/dx^2 = 0$, whose solution is 

$$\phi = ax + b,$$

showing that the field is constant everywhere, and in the x direction. 
To investigate conditions on the surface, we set up a thin 
flat volume, with its broad sides parallel to the charged plane, and 
enclosing just 1 sq. cm. of this plane. It will then hold charge $\sigma$, 
so that $4\pi\sigma$ lines will diverge from it. By symmetry, these will leave 
it at right angles, and an equal number over each face. Hence $2\pi\sigma$ 
will leave over each face, or the field strength is $2\pi\sigma$ on the one 
side, $-2\pi\sigma$ on the other. We have the same result as before, with 
very much simpler calculation. 

Similar problems are met in the theory of the condenser. Take, for 
example, the parallel plate condenser, as in Fig. 36, two charged 
plates of area $A$, so large in proportion to their separation $d$ that 
they can be almost treated as infinite. Let the charge per square 
centimeter be $\sigma$ on one plate, $-\sigma$ on the other. Then we must find 
the potential difference between the plates, for by definition the 
capacity $C$ is the charge $A\sigma$ on one plate divided by the potential 
difference. But now, just as in the last case, the field must be constant 
and perpendicular to the plates. It can have different values in 
the three regions to the left of the plates, between, and to the 
right. And it has a discontinuity of $4\pi\sigma$ in passing through a 
plate of surface density $\sigma$. These conditions are all satisfied by 
having no field outside the condenser, and by having a field $4\pi\sigma$ 
within, pointing from the positive plate to the negative. Thus 
the potential difference, being the field times the distance, is $4\pi\sigma d$, 
so that 

$$C = \frac{A\sigma}{4\pi\sigma d} = \frac{A}{4\pi d}, \qquad (3)$$

the familiar formula for a parallel plate condenser. It should 
be noticed that capacitance has the dimensions of a length in 
the electrostatic system of units. 

Fig. 36. Field of parallel plate condenser. 
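
As a rough numerical illustration of Eq. (3), the sketch below evaluates the capacity in centimeters and cross-checks it against the SI formula $C = \epsilon_0 A/d$. The conversion constant (1 cm of capacitance, or one statfarad, is about $1.11265 \times 10^{-12}$ farad) and the function names are assumptions brought in for the example; they do not appear in the text.

```python
import math

STATFARAD_IN_FARAD = 1.112650e-12  # 1 cm of capacitance expressed in farads
EPSILON_0 = 8.8541878e-12          # F/m, used only for the SI cross-check

def capacity_cm(area_cm2, separation_cm):
    """Eq. (3): C = A / (4 pi d), a length in the electrostatic system."""
    return area_cm2 / (4.0 * math.pi * separation_cm)

def capacity_farad_si(area_cm2, separation_cm):
    """The same condenser from the SI formula C = epsilon_0 A / d."""
    return EPSILON_0 * (area_cm2 * 1e-4) / (separation_cm * 1e-2)

if __name__ == "__main__":
    area, gap = 100.0, 0.1  # hypothetical plates: 100 sq. cm., 1 mm apart
    c_cm = capacity_cm(area, gap)
    print("capacity in cm:", c_cm)
    print("converted to farads:", c_cm * STATFARAD_IN_FARAD)
    print("SI formula, farads:", capacity_farad_si(area, gap))
```

The two farad values agree, as they must, since the conversion factor is just $4\pi\epsilon_0$ times one centimeter.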

136. Electrostatic Problems with Conductors. — The second 
sort of electrostatic problem is more difficult. It is that in which 
there are conductors as well as charges. Now in the presence of 
a charge, induced charges are set up on conductors, and it is 
usually a difficult problem to find how they are distributed, and 
hence to find their field. In this case it is practically indis- 
pensable to make use of the methods of potential theory. To see 
how to proceed, let us imagine the train of events which would 
occur when a charge was brought near a conductor. The charge 
would carry with it a field, which in general would be such that 
different parts of the conductor were at different potentials. 
Now a conductor has the peculiarity that if there is a field in it, 
a current flows, and continues to flow as long as the field remains. 
Thus charge will start to flow through the conductor, being 
attracted or repelled by the external charge. This will continue 
until just such a charge distribution has been set up in the con- 
ductor that the field resulting from it plus the external charges 
reduces to zero within the conductor, or the potential throughout 
the conductor is constant, for this is the condition for no current 
flow. In other words, the whole of a conductor, surface and 
inside, is part of a single equipotential. We then solve such a 
problem in the following way: we look for a solution of Poisson's 
equation, holding in the region outside the conductors, and reduc- 
ing to constants on the boundaries. This solution thus gives the 
potential of the problem, and its gradient gives the field. 

We can illustrate better by a problem. Consider an infinite 
conducting plane, uncharged as a whole, with a charge e in front 
of it at a distance d. Now we wish a solution of Poisson's equa- 
tion, reducing to a constant over the face of the plane. We set 
this up by a device, called the method of images. We imagine 
the plate removed, its face replaced by an imaginary plane, and 
at a distance d behind the plane we put a charge —e, as if it were 
e's image in a mirror, as shown in Fig. 37. Then these two 



charges together would keep the whole plane just at potential 
zero. For any point of the plane is equidistant from both charges, 
one has the potential $e/r$, and the other $-e/r$, and they just 
cancel. The potential at any point of space can be easily found, 
now, in the field of these charges. It is simply 

$$\phi = \frac{e}{r_1} - \frac{e}{r_2},$$

if $r_1$ is the distance from the charge $e$, $r_2$ the distance from its 
mirror image. The lines of force and equipotentials look like 
those of a bar magnet, and it is perfectly true that the plane 
bisecting the magnet is an equipotential. In our actual problem, 
now, the potential in the empty space is just that given by our 
field of two charges; in the metal the potential is zero. 

Fig. 37. Lines of force for charge $e$ in front of conducting plane, by 
method of images. 

We might naturally inquire what induced distribution of 
charge would be set up in the conducting plane, to produce this 
final field. In the first place, in a steady state, the charge within 
a conductor is always zero. For the field is zero within it, there- 
fore its divergence is zero. Thus all charge is concentrated on 
the surface. Next, as we showed before, the normal component 
of the electric field has a discontinuity of $4\pi\sigma$ at a surface carrying 
a surface charge $\sigma$. Thus if we can compute the discontinuity, 
we can in turn get the surface density of charge. In our case the 
field is normal to the plate, by symmetry, so that the discontinuity 
of E n in crossing the surface is just equal to the total E outside. 
This may be found at once from our known potential function, 
so that we could get the necessary surface charge. 
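
The procedure just described can be sketched numerically. In the example below the charge $e$ sits at height $d$ above the plane and its image $-e$ at depth $d$ below it; the surface density is taken as the normal field just outside the metal divided by $4\pi$, and the induced charge inside a circle of radius `s_max` is summed ring by ring. The function names and the finite cutoff are assumptions made for the illustration.

```python
import math

def induced_sigma(e, d, s):
    """Surface density at distance s from the foot of the perpendicular.

    At the plane the charge e (height d) and its image -e (depth d) give
    equal normal field components, each -e*d / (d^2 + s^2)^(3/2).
    """
    r = math.sqrt(d * d + s * s)
    field_from_charge = -e * d / r ** 3
    field_from_image = -e * d / r ** 3
    normal_field = field_from_charge + field_from_image
    return normal_field / (4.0 * math.pi)  # discontinuity of E_n is 4*pi*sigma

def induced_charge(e, d, s_max, steps=100_000):
    """Total induced charge within radius s_max, summed over rings."""
    ds = s_max / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        total += induced_sigma(e, d, s) * 2.0 * math.pi * s * ds
    return total

if __name__ == "__main__":
    e, d = 1.0, 1.0
    print("density under the charge:", induced_sigma(e, d, 0.0))
    # Tends to -e as s_max grows: the induced charge equals the image charge.
    print("charge within s_max = 200 d:", induced_charge(e, d, 200.0))
```

The density is negative for positive $e$ and falls off as the inverse cube of the distance from the charge, and the summed charge approaches $-e$, the strength of the image.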

137. Green's Theorem. — The fundamental theorem of 
potential theory is a mathematical relation called Green's 
theorem. It is a result of Gauss's theorem, and is easily proved. 
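Gauss's theorem states that $\iiint \operatorname{div}\mathbf{E}\,dv = \iint E_n\,dS$ for any 
vector E. Now let $\mathbf{E} = \phi\,\operatorname{grad}\psi$, where $\phi$ and $\psi$ are two scalar 
functions; then $\operatorname{div}\mathbf{E} = \operatorname{div}(\phi\,\operatorname{grad}\psi) = \phi\nabla^2\psi + \operatorname{grad}\phi\cdot\operatorname{grad}\psi$, 
as we can easily prove. Also $E_n = \phi\,\partial\psi/\partial n$, where $\partial\psi/\partial n$ is the normal 
derivative, the component of the gradient along $n$. Hence we have 

$$\iiint \left(\phi\nabla^2\psi + \operatorname{grad}\phi\cdot\operatorname{grad}\psi\right) dv = \iint \phi\,\frac{\partial\psi}{\partial n}\,dS. \qquad (4)$$

This is one form of Green's theorem. To get the more familiar 
form, we next write just the same expression with $\phi$ and $\psi$ 
interchanged: 

$$\iiint \left(\psi\nabla^2\phi + \operatorname{grad}\psi\cdot\operatorname{grad}\phi\right) dv = \iint \psi\,\frac{\partial\phi}{\partial n}\,dS.$$

Now we subtract, obtaining 

$$\iiint \left(\phi\nabla^2\psi - \psi\nabla^2\phi\right) dv = \iint \left(\phi\,\frac{\partial\psi}{\partial n} - \psi\,\frac{\partial\phi}{\partial n}\right) dS. \qquad (5)$$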

This is the common form of Green's theorem. We shall now 
consider a number of applications of this mathematical theorem. 
These applications come mostly in the discussion of methods of 
solving Poisson's and Laplace's equations. Of course, these can 
be solved by the method of separation of variables, and develop- 
ment in series of orthogonal functions. But the present method, 
called Green's method, is quite different, and almost more useful 
in a general discussion, though perhaps not in particular problems. 

138. Proof of Solution of Poisson's Equation. — We can easily 
see how to solve Poisson's equation, $\nabla^2\phi = -4\pi\rho$. For this 
gives the potential $\phi$ due to a charge distribution. Now if we 
divide space into small elements of volume $dv$, the charge $\rho\,dv$ 
will exert a potential $\rho\,dv/r$, if $r$ is the distance of $dv$ from the 
point where we wish the potential. Thus the whole potential is 

$$\phi = \iiint \frac{\rho}{r}\,dv.$$

But $\rho = -\frac{1}{4\pi}\nabla^2\phi$, so that we have 

$$\phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv, \qquad (6)$$



giving the solution of Poisson's equation. In this integral, 
we must integrate over all space, so as to include all charges. 
We have derived our solution rather intuitively from the known 
solution for a point charge. But we can derive it rigorously 
from Green's theorem. 

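In the last form of Green's theorem, let $\psi = 1/r$, where $r$ 
is the distance from a point $P$, and let $\phi$ be the potential. 
Thus we have 

$$\iiint \left[\phi\,\nabla^2\!\left(\frac{1}{r}\right) - \frac{1}{r}\nabla^2\phi\right] dv = \iint \left[\phi\,\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial\phi}{\partial n}\right] dS.$$

This is true no matter what volume we use. Let us choose as 
our volume the whole of space, except for a tiny sphere of radius 
$R$ surrounding the point $P$ where we wish to compute the potential. 
Now $\nabla^2(1/r) = 0$, except where $r = 0$, so that it is zero 
throughout the whole of our volume, and the left side becomes 
$-\iiint (\nabla^2\phi/r)\,dv$. Let us compute the right side. The integral 
is to be taken over the surface of our volume, which consists 
of our tiny sphere, and a surface at infinity, which for the present 
we neglect. Over the surface of the tiny sphere, the direction 
$n$ is simply the radial direction, pointing in toward $P$ (because 
it is directed out of the volume). We have 

$$\frac{\partial(1/r)}{\partial n} = -\frac{\partial(1/r)}{\partial r} = \frac{1}{r^2}, \qquad \frac{\partial\phi}{\partial n} = -\frac{\partial\phi}{\partial r}.$$

Then the right side is 

$$\iint \left(\frac{\phi}{r^2} + \frac{1}{r}\frac{\partial\phi}{\partial r}\right) dS.$$

But on the surface of the sphere, $r = R$, so that this is 

$$\frac{1}{R^2}\iint \phi\,dS + \frac{1}{R}\iint \frac{\partial\phi}{\partial r}\,dS.$$

Now $\iint \phi\,dS \big/ \iint dS$ is just the mean value $\bar\phi$ of $\phi$ over the surface, 
and similarly $\iint (\partial\phi/\partial r)\,dS \big/ \iint dS$ is the mean value 
$\overline{\partial\phi/\partial r}$ of $\partial\phi/\partial r$. But the $\iint dS$ is the area of the sphere 
$= 4\pi R^2$, so that our integral is $4\pi\bar\phi + 4\pi R\,\overline{\partial\phi/\partial r}$, and the whole 
relation is, rearranging, 

$$\bar\phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv - R\,\overline{\frac{\partial\phi}{\partial r}}.$$

If now $R$ approaches zero, the last term vanishes, and $\bar\phi$ 
approaches $\phi$, the value at the point $P$. Hence we have 

$$\phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv,$$

the solution of Poisson's equation which we wished to prove. 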

There are several points to be mentioned in connection with 
this proof. In the first place, the volume integral is taken over 
all space, except an infinitely small sphere surrounding P : a point 
charge exerts an effect on all other charges, but not on itself. 
Secondly, we neglected entirely the fact that our volume has 
a surface at infinity, which we should take into account in 
calculating our surface integrals. Suppose that the volume were 
not really infinite, but merely very large, being bounded, say, 
by a second large sphere of radius $R'$. Then the surface integral 
over the large sphere is similar to that over the small one, but 
with opposite sign: it is $-\left(4\pi\bar\phi' + 4\pi R'\,\overline{\partial\phi/\partial r}'\right)$, where now $\bar\phi'$ is the 
mean over the large sphere, etc. To neglect these terms, as 
we have done, their limits must be zero as $R'$ becomes infinite. 
That is, $\bar\phi'$ must go to zero at infinite distance, and $R'\,\overline{\partial\phi/\partial r}'$ must 
also go to zero. These are both satisfied if $\phi$ is the potential 
of a set of charges at finite points, for then $\phi$ will go as $1/r$, 
$\partial\phi/\partial r$ will go as $1/r^2$, and $r\,\partial\phi/\partial r$ will fall off as $1/r$, becoming 
zero as $r$ becomes infinite. 


139. Solution of Poisson's Equation in a Finite Region. — 

Suppose now that instead of extending our integral over all 
space, we integrate only over a finite volume V, with surface S, 
excluding in each case our infinitesimal sphere of radius R. 
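Then plainly we have 

$$\phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv + \frac{1}{4\pi}\iint \frac{1}{r}\frac{\partial\phi}{\partial n}\,dS - \frac{1}{4\pi}\iint \phi\,\frac{\partial(1/r)}{\partial n}\,dS, \qquad (7)$$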

where the volume integral is taken over the whole volume V, 
excluding the infinitesimal sphere, and the surface integral 
is taken over S. 

We can explain this important formula in words much better 
than by mathematics. The potential at a given point can be 
written as the sum of two parts : the potential of all the charges 
within a certain finite volume surrounding the point, and another 
part, which, of course, must represent the potential of the other 
charges outside our volume. But the second term appears as 
a surface integral, not a volume integral. This is an example 
of the usual sort of application of Green's theorem: the replace- 
ment of a volume integral by a surface integral. 

There is one interesting way of regarding the solution. Suppose 
first that $\rho$ were zero all through our volume, though not 
outside. Then $\nabla^2\phi$ will be zero inside, and the volume integral 
will vanish. Further, $\phi$ will satisfy Laplace's equation within 
the region. The surface integral, in other words, represents a 
solution of Laplace's equation within our region, in terms of an 
integral over the boundary of the region. As a matter of fact, 
any solution of Laplace's equation in this region can be written 
in this way, by using the proper boundary values of $\phi$ and $\partial\phi/\partial n$ 
at the surface. The last two terms in our solution, in other 
words, represent a general solution of the homogeneous equation 
$\nabla^2\phi = 0$, the arbitrary functions (which with partial differential 
equations replace the arbitrary constants) being the boundary 
values of $\phi$ and $\partial\phi/\partial n$. The volume integral, on the other hand, 
represents a particular solution of the inhomogeneous equation 
$\nabla^2\phi = -4\pi\rho$, satisfying the equation but not its boundary 
values. Thus we have the familiar case in which the solution 
of an inhomogeneous equation is the sum of a particular solution 
and the general solution of the related homogeneous equation. 
And this general solution is to be so chosen that the sum of both 
terms satisfies the boundary values, on the surface of the volume. 

140. Green's Distribution. — When we examine the surface 
integral of Eq. (7) more in detail, we can see what it represents. 
The term $\frac{1}{4\pi}\iint \frac{1}{r}\frac{\partial\phi}{\partial n}\,dS$ represents evidently the potential arising 
from a certain surface charge, of surface density $\frac{1}{4\pi}\frac{\partial\phi}{\partial n}$. The 
other term, $-\frac{1}{4\pi}\iint \phi\,\frac{\partial(1/r)}{\partial n}\,dS$, is a little complicated. The 
term $\partial(1/r)/\partial n$ is the difference between the potential of two unit 
charges, spaced at a distance $dn$ along the normal, divided by 
$dn$; that is, it is the potential of two charges, one of strength 
$1/dn$, the other $-1/dn$, at distance $dn$, as in Fig. 38. Such a 
combination of an equal and opposite positive and negative charge 
very close together is called a dipole. The strength of a dipole, or 
the dipole moment, is the strength of one of the charges times the 
distance of separation. Thus in our case the strength is $(1/dn)\,dn$, 
so that we have the potential of a unit dipole. Then the integral is 
the potential of a dipole distribution of moment $\phi/4\pi$ per unit 
area. Such a distribution is called a double layer, since it consists 
of layers of positive and negative charges close together. We then 
see that by spreading on the surface of our region a suitable layer 
of surface charge, and a double layer of dipoles, we produce just the 
same field inside that the external charges would give. This 
distribution of charge and double layer is called Green's distribution. 

Fig. 38. Potential of unit dipole, consisting of charges $\pm 1/dn$ at 
distance $dn$ apart. 

Suppose that we know that a given function $\phi$ satisfies Laplace's 
equation within a given region. Suppose further that 
we know its boundary value $\phi$, and its normal derivative $\partial\phi/\partial n$, 
at all points of the surface of the region. Then we can at once 
write the solution of Laplace's equation having these boundary 
values. It is 

$$\phi = -\frac{1}{4\pi}\iint \left(\phi\,\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial\phi}{\partial n}\right) dS,$$

integrated over the boundary. This is obviously a very simple 
way of getting a solution of a differential equation satisfying 
given boundary values. In particular it is simpler than the 
methods we have used so far, in that we can apply it to any 
form of surface. 

There is a simple interpretation of Green's distribution. 
Suppose that the field within our volume were just what it is, 
but that outside the volume the field and potential were everywhere 
zero. Then at the boundary there would be a discontinuity 
of potential and field. Now we have already seen that 
at a surface charge $\sigma$ there is a discontinuity of field, $4\pi\sigma$, so 
that at a discontinuity of the field there is a surface charge equal 
to $1/4\pi$ times the discontinuity of the normal component of 
the field. Thus if the normal component of the field is zero 
outside, $\partial\phi/\partial n$ inside, the surface charge is $(1/4\pi)\,\partial\phi/\partial n$. This is 
just the surface charge concerned in Green's distribution. 
Similarly, at a boundary where there is a discontinuity of potential, 
there must be a double layer, of moment per unit area 
equal to $1/4\pi$ times the discontinuity of the potential, as we see 
from a condenser of charge $\sigma$, dipole moment $\sigma d$ per unit area, 
potential difference $4\pi\sigma d$. This gives the double layer of Green's 
distribution. In other words, these surface charges and layers, 
plus the charges within the region, are just those necessary to 
give the potential its actual values within the volume, and to 
reduce it to zero outside. 

141. Green's Method of Solving Differential Equations. — 
We have seen in the present chapter a method, called Green's 
method, for solving differential equations, quite different from 
any we have met before, except the integral method of treating 
heat flow, which is very similar. The most characteristic 
part of the method is in the solution of Poisson's equation, as 
an integral of $\rho/r$ over all space. Here we had an inhomogeneous 
equation, $\nabla^2\phi = -4\pi\rho$. Suppose we let $\rho = \rho_1 + \rho_2 + \rho_3 + \cdots$, 
where $\rho_i$ is equal to $\rho$ in the $i$th volume element $dv_i$, but is zero 
elsewhere. Then we can write the equations $\nabla^2\phi_1 = -4\pi\rho_1$, 
$\nabla^2\phi_2 = -4\pi\rho_2$, . . . , for each of these, where $\rho_i$ is different from 
zero only in a very small region, so that the problem is practically 
that of a point charge, which we can solve. We add these functions 
to get the whole solution, according to Sec. 26, Chap. IV. 
This is the essence of Green's method, the separation of the 
inhomogeneous part of the equation into simple parts, each of 
which we can solve. The function 1/r, which is the solution 
for one of these problems, is called the Green's function. As a 
matter of fact, a general method of solving differential equations 
by means of Green's functions has been worked out, and it lies 
at the basis of much of the more advanced work on the theory 
of differential equations, particularly of the second order. 
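
The superposition that underlies Green's method can be tried numerically. The sketch below is an illustration under stated assumptions: a sphere of uniform density is cut into small cubic cells, each cell is treated as a point charge $\rho\,dv$ with potential $\rho\,dv/r$, and the contributions are added. At a point outside the sphere the sum should come out close to the total charge divided by the distance from the center. The grid size and function names are choices made for the example.

```python
import math

def sphere_cells(radius, n):
    """Centers of the cubic cells (side 2*radius/n) lying inside the sphere."""
    h = 2.0 * radius / n
    cells = []
    for i in range(n):
        x = -radius + (i + 0.5) * h
        for j in range(n):
            y = -radius + (j + 0.5) * h
            for k in range(n):
                z = -radius + (k + 0.5) * h
                if x * x + y * y + z * z <= radius * radius:
                    cells.append((x, y, z))
    return cells, h ** 3

def potential_at(point, cells, rho, dv):
    """Sum of the point-charge potentials rho*dv/r of all the cells."""
    px, py, pz = point
    total = 0.0
    for (x, y, z) in cells:
        r = math.sqrt((px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2)
        total += rho * dv / r
    return total

if __name__ == "__main__":
    rho = 1.0
    cells, dv = sphere_cells(1.0, 20)
    charge = rho * dv * len(cells)
    point = (3.0, 4.0, 0.0)  # five radii from the center
    print("summed potential:", potential_at(point, cells, rho, dv))
    print("charge / distance:", charge / 5.0)
```

Here the Green's function $1/r$ enters once per cell, and the sum over cells is the discrete form of the integral of $\rho/r$ over the charged region.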


1. Given a spherical distribution of charge, in which the density is a 
function of r. Prove that the field at any point is what would be obtained 
by imagining a sphere drawn through the point, with its center at the origin, 
all the charge within the sphere concentrated at the center, and all the 
charge outside removed. Apply to gravitation, showing that the earth acts 
on bodies at its surface as if its mass were concentrated at the center. 

2. Given a sphere filled with charge of constant density. Prove that at 
points within the sphere, the field is directly proportional to the distance 
from the center. 

3. A condenser consists of two concentric spheres, holding equal and 
opposite charges. Find its capacity. Similarly find the capacity of a 
condenser consisting of two long concentric circular cylinders. 

4. Compute the surface density induced by a charge on a plane conductor. 

5. In a certain spherical distribution of charge, the potential is given by 
$e^{-ar}/r$. Find the charge density as a function of $r$. Also find the charge 
contained between $r$ and $r + dr$. This represents roughly the charge 
distribution within an atom. 

6. Prove $\operatorname{div}(\phi\,\operatorname{grad}\psi) = \phi\nabla^2\psi + \operatorname{grad}\phi\cdot\operatorname{grad}\psi$. 

7. There are certain charges and conductors in an electrostatic field, 
whose potential is $\phi$. Show that the surface density of charge on the surface 
of a conductor is $-\frac{1}{4\pi}\frac{\partial\phi}{\partial n}$, where $n$ is the normal pointing out of the conductor. 
Show that the electric field is normal to the surface of a conductor. 

8. It requires several volts energy to remove an electron from the interior 
of a metal to the region outside. Find how many volts, if the double layer 
at the surface consists of two parallel sheets of charge, a sheet of negative 
electricity, of density as if there were electrons of charge $4.77 \times 10^{-10}$ e.s.u., 
spread out uniformly with a density of one to a square $4 \times 10^{-8}$ cm. on a 
side, and inside that at a distance of $0.5 \times 10^{-8}$ cm. a similar sheet of 
positive charges. Remember that 300 volts = 1 e.s.u. of potential. 

9. Discuss the potential and field of a dipole. 

10. An uncharged metallic sphere of radius $R$ is placed in a homogeneous 
electric field of intensity $E_0$. Calculate the potential at any point of space, 
and sketch the equipotential curves. (Hint: Solve Laplace's equation in 
polar coordinates taking the z axis as the direction of $E_0$. Note that there 
is symmetry about the z axis. Try a solution of the form 

$$\phi = F_1(r) + F_2(r)\cos\theta$$

with the conditions that 

$$F_1(r) \to 0, \quad \text{as } r \to \infty$$
$$F_2(r) \to E_0 r, \quad \text{as } r \to \infty$$

and that $\phi$ must be constant all over the sphere of radius $R$.) Solve the 
problem for the case that the sphere carries a total charge $e$. 

11. The equipotentials due to two point charges e and e' are given by 
e/r + e'/r' = C. Show that the surface becomes spherical if e is of opposite 
sign to e' and C = 0. Consider a spherical conductor coinciding with this 
surface which is grounded. This does not disturb the field, so that these 
charges give the field we would have if one of the charges were removed and 
the metallic sphere left there. Show that if $a$ is the radius of the sphere and 
$L$ the distance from the charge (outside the sphere) to the center of the 
sphere, the image charge inside the sphere lies a distance $L'$ from the center 
such that $a^2 = LL'$ and has a charge $e' = -ea/L$. Show that the surface 
density of induced charge varies inversely as the cube of the distance from 
the charge outside the surface to the point of the surface under consideration. 




The static magnetic field resembles the electrostatic field in 
many ways. The intensity of the field due to a magnetic pole is 
equal to the pole strength divided by the square of the distance 
of the point at which the intensity is measured, so that magnetic 
poles display close analogy to electric charges. The intensity 
of this field H is defined as the force per unit magnetic pole, and 
this is measured in the system of units known as the electro- 
magnetic, as distinct from the electrostatic. We shall discuss 
the relation between these systems of units in a later section. 
The vector H satisfies the equation 

div H = $4\pi$ × density of magnetic poles, 

but here a very important difference appears; north and south 
magnetic poles never can exist alone. No matter how small one 
takes a volume element, the north and south poles just cancel, 
so that the total density of magnetic poles is zero. Hence we have 

div H = 0. (1) 

Thus we must always deal with at least a pair of opposite poles, 
and here we always have a magnetic dipole, whose behavior is 
just like that of an electric dipole. The magnetic moment of a 
bar magnet is defined as the product of the strength of one of the 
poles times the distance of separation, and magnetic fields are 
measured by measuring the torque exerted on a suspended magnet 
(magnetometer). Exactly as we have defined the electromotive force 
$\int_A^B \mathbf{E}\cdot d\mathbf{s}$, we can now define as the 
magnetomotive force $\int_A^B \mathbf{H}\cdot d\mathbf{s}$. This is the work per unit pole 
done by the magnetic field as the pole is moved along a path from 
A to B. There is also a magnetic potential $\Phi = -\int \mathbf{H}\cdot d\mathbf{s}$, 
and in the field of permanent magnets $\oint \mathbf{H}\cdot d\mathbf{s}$ taken around any 
closed path is zero. 



142. The Magnetic Field of Currents. — It is when we come to 
consider the magnetic fields due to currents that we meet differ- 
ences from the electrostatic case. Suppose that we have a 
straight wire in which a steady current flows. The magnetic 
lines of force are concentric circles around the wire and it is clear 
that if we calculate the integral $\oint \mathbf{H}\cdot d\mathbf{s}$ following one of these 
circles, we shall not find that its value is zero for such a closed 
path. On the other hand if we evaluate $\oint \mathbf{H}\cdot d\mathbf{s}$ around any 
closed path which does not encircle the wire, it does vanish, and 
the situation is then analogous to the electrostatic case. These 
considerations hold for any closed circuit carrying a current. 
We can reduce our problem to an ordinary magnetostatic one 
by the following device: suppose that we construct a surface bounded by 
the wire carrying the current and do not allow any of the curves along 
which we calculate $\int \mathbf{H}\cdot d\mathbf{s}$ to cut through this surface. Then no closed 
paths are possible which encircle the current, $\oint \mathbf{H}\cdot d\mathbf{s} = 0$ around every 
path, and everywhere in space there is a magnetic potential $\Phi$. Suppose 
we evaluate $\int \mathbf{H}\cdot d\mathbf{s}$ along a curve starting at $a$ on one side of the 
surface and following a line of force around to a point $b$ on the other side 
of the surface, as in Fig. 39. The difference of magnetic potential 
between $a$ and $b$ is given by 

$$\Phi_a - \Phi_b = \int_a^b \mathbf{H}\cdot d\mathbf{s} = -\int_a^b \frac{\partial\Phi}{\partial s}\,ds,$$

and the potential difference does not approach zero as we let 
$a$ approach $b$ since then the curve would cut our surface. This 
must mean that there is a jump in potential as we cross the surface. 
We have already seen in the last chapter that a surface 
distribution of dipoles (a double layer) produces a discontinuity 
in potential, so that we can replace our current by a surface 
layer of magnetic dipoles on a surface whose boundary is the 
current-carrying wire, and produce exactly the same magnetic 
field as the current. Suppose that we have a surface of area 
$A$ on which we have a dipole layer of constant moment $m_0$ 
per unit area. (This may be either an electric or magnetic 
dipole layer.) Consider a point $P$ outside the surface. If one 
looks from $P$ to the surface, the surface subtends a solid angle 
$\Omega$. It is easy to show that the potential at $P$ is equal to $m_0$ 
times $\Omega$. The proof of this is left to a problem. In particular 
if $P$ approaches the surface, $\Omega$ approaches $2\pi$ so that the potential 
at a point just one side of the surface is $2\pi m_0$. Similarly on the 
other side of the surface the potential is $-2\pi m_0$, so that there is 
a discontinuity of potential equal to $4\pi m_0$ as one crosses the 
double layer. Thus in our case we have 

$$\Phi_a - \Phi_b = \pm 4\pi m_0$$

(the $\pm$ depends on which way we go around the curve $ab$), so 

$$\oint \mathbf{H}\cdot d\mathbf{s} = \pm 4\pi m_0$$

around a closed curve which cuts through the double layer surface 
and is zero for every other closed curve. In the following 
we shall always go around the curve in such a direction that 

$$\oint \mathbf{H}\cdot d\mathbf{s} = 4\pi m_0.$$

Fig. 39. Magnetic shell and multiple valued potential. The potential 
difference between $a$ and $b$ is $4\pi m_0$, or $4\pi i$, where $m_0$ is the strength 
of the double layer producing the same magnetic field as the current $i$ 
in the wire encircling the shell. 

If we now ask how $m_0$ depends on the current, we must get the 
answer from experiment and the relation turns out to be exceedingly 
simple; the magnetic moment per unit area $m_0$ is proportional 
to the current. If we have not as yet defined the unit 
of current we may place $m_0 = i$, and this equation defines the 
unit current in the electromagnetic system of units. Thus 

$$\oint \mathbf{H}\cdot d\mathbf{s} = 4\pi i, \qquad (2)$$

where the integration is carried once around a path encircling 
the wire. If we go around again the value of the integral 
increases again by $4\pi i$, and so on for every complete circuit of 
the path. This unit of current which we have introduced is 
called the abampere and is ten times as large as the practical 
unit, the ampere. On the other hand, we might wish to utilize 
the electrostatic unit of current, defined as the current in which 
one electrostatic unit of charge passes a given point per second. 
It is necessary to determine experimentally the proportionality 
constant between $m_0$ and $i$. This has been done and turns out 
to be $1/c$, where $c = 3 \times 10^{10}$ cm. per second. If we express 
our current in electrostatic measure, the work done in carrying 
a unit north pole around a circuit enclosing the current is 

$$\oint \mathbf{H}\cdot d\mathbf{s} = \frac{4\pi i}{c}. \qquad (3)$$

The e.s.u. of current is $\frac{1}{3} \times 10^{-9}$ ampere. 

143. Field of a Straight Wire. — We can illustrate these ideas 
easily in the case of a straight wire carrying a constant current. 
Since the lines of force are circles, let us calculate the work done 
in carrying a unit pole around such a circle of radius r. In this 
case H has the same value all along the circle and is tangent to it. 

$$\oint \mathbf{H}\cdot d\mathbf{s} = H\oint ds = 2\pi r H = 4\pi i,$$

so that the magnetic field intensity at a distance $r$ from a straight 
wire carrying a current $i$ is 

$$H = \frac{2i}{r}. \qquad (4)$$

We can now set up the potential for this case. Thus, let the wire be 
along the z axis, as in Fig. 40, so that the field is given by 

$$H_x = -\frac{2iy}{r^2}, \qquad H_y = \frac{2ix}{r^2}, \qquad H_z = 0.$$

Then curl H = 0, as we can immediately prove by substitution. 
Thus, for example, 

$$\operatorname{curl}_z \mathbf{H} = \frac{\partial(2ix/r^2)}{\partial x} - \frac{\partial(-2iy/r^2)}{\partial y} = \frac{2i}{r^2} - \frac{4ix^2}{r^4} + \frac{2i}{r^2} - \frac{4iy^2}{r^4} = 0.$$

Thus we can have a potential, and it is easy to see that it must have as its 
equipotentials the lines $\theta$ = constant, where $\theta$ is the polar angle 
in the xy plane, since these are at right angles to the lines 
of force. If we set $\Phi = -2i\theta = -2i\tan^{-1}(y/x)$, we have 
$-\partial\Phi/\partial x = -2iy/r^2 = H_x$, $-\partial\Phi/\partial y = 2ix/r^2 = H_y$, so that we 
have actually exhibited the potential. 

Fig. 40. Magnetic lines of force (circles) and equipotentials (radii) for 
the field of a wire carrying a current (at right angles to the paper). 
H is perpendicular to the radius. Therefore $H_x : H_y = -y : x$. 
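
The two statements about the line integral of H can be checked by direct summation. The sketch below assumes electromagnetic units, so that the field of the wire is $H_x = -2iy/r^2$, $H_y = 2ix/r^2$, and integrates H counterclockwise around circles that either enclose the wire or miss it. The function names and the choice of circular paths are made for the illustration only.

```python
import math

def wire_field(i, x, y):
    """Field of a straight wire along the z axis carrying current i (e.m.u.)."""
    r2 = x * x + y * y
    return (-2.0 * i * y / r2, 2.0 * i * x / r2)

def loop_integral(i, cx, cy, radius, steps=20_000):
    """Line integral of H counterclockwise around a circle centered at (cx, cy)."""
    dt = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        hx, hy = wire_field(i, x, y)
        # Element of path: ds = (-radius*sin t, radius*cos t) dt
        total += (-hx * radius * math.sin(t) + hy * radius * math.cos(t)) * dt
    return total

if __name__ == "__main__":
    i = 1.5
    print("path enclosing the wire:", loop_integral(i, 0.3, -0.2, 2.0))
    print("4*pi*i:                 ", 4.0 * math.pi * i)
    print("path missing the wire:  ", loop_integral(i, 5.0, 0.0, 2.0))
```

The first path need not be centered on the wire; any circle that encloses it gives $4\pi i$, and any circle that does not gives zero to within rounding.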

But now we see that the potential is not single-valued. For a 
given value of $x$ and $y$, the angle $\tan^{-1}(y/x)$ can have an infinite 
number of values, differing by $2\pi$, and the potential can have 
an infinite number of values differing by $4\pi i$, in agreement with 
what we found before. Thus the potential is not defined in 
as simple and definite a way as in electrostatics. The interpretation 
of this situation comes from a theorem called Stokes's theorem. 

144. Stokes's Theorem. — Stokes's theorem states that if we 
have any closed curve, and integrate the tangential component 
of a vector F around it, the result is equal to what we obtain if 
we take some surface bounded by the curve, and integrate the 
normal component of the curl of F over this surface: 

$$\oint F_s\,ds = \iint \operatorname{curl}_n \mathbf{F}\,dS. \qquad (5)$$

To prove it, we first divide up the surface into small surface
elements, of area dS. For one of these the surface integral is
$\operatorname{curl}_n F\, dS$. Now suppose we choose the axes so that the
z axis is normal to dS, and the area dS is bounded by
x, y, x + dx, and y + dy as in Fig. 41.

[Fig. 41. Circuit for proving Stokes's theorem: a rectangle with corners
(x, y), (x + dx, y), (x + dx, y + dy), (x, y + dy).]

Then the surface integral is

$$\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) dx\, dy.$$

Let us next compute $\oint F_s\, ds$ for the element of area. It is
evidently

$$F_x(x,y)\,dx + F_y(x+dx,y)\,dy - F_x(x,y+dy)\,dx - F_y(x,y)\,dy = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) dx\, dy,$$

if we go around so as always to keep the surface on the left. Thus the
theorem is true for such an infinitesimal surface. But now, if we
put the whole surface together out of its elements, the total
surface integral will be the sum of the parts, or $\iint \operatorname{curl}_n F\, dS$. Also
the total line integral will be the sum of the integrals over the
separate elements. To see this, we note that in making the sum,
all boundaries except the outside edge of the area are shared by
two elements of the area, and the line integral from one traverses
the boundary in one direction, from the other in the opposite
direction, so that the contributions all cancel, leaving only the
integral over the outer boundary, which is then $\oint F_s\, ds$. Thus
Stokes's theorem is proved.
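The theorem can be checked numerically on a simple case. The sketch below is only an illustration: the field F = (-xy, x^2) and the unit square are our own choices, not taken from the text, and the curl is worked out by hand for that field. The line integral is taken counterclockwise, keeping the surface on the left.

```python
# Numerical check of Stokes's theorem on the unit square in the xy plane.
# The field F = (-x*y, x**2) is an assumed example, not one from the text.

def F(x, y):
    return (-x * y, x * x)

def curl_z(x, y):
    # dFy/dx - dFx/dy for the field above, worked out by hand: 2x + x
    return 3.0 * x

def line_integral(n=2000):
    """Tangential component of F summed counterclockwise around the square."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h              # midpoint of each small step
        total += F(t, 0.0)[0] * h      # bottom edge, traversed toward +x
        total += F(1.0, t)[1] * h      # right edge, traversed toward +y
        total -= F(t, 1.0)[0] * h      # top edge, traversed toward -x
        total -= F(0.0, t)[1] * h      # left edge, traversed toward -y
    return total

def surface_integral(n=200):
    """Normal component of curl F summed over the square (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += curl_z((i + 0.5) * h, (j + 0.5) * h) * h * h
    return total
```

Both functions should return the same number to within rounding, which is what Eq. (5) asserts for this field.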

145. The Curl in Curvilinear Coordinates. — It is often useful 
to have the curl, and Stokes's theorem, in curvilinear coordinates. 
We refer back to Chap. XVIII, using methods analogous to those 
used there in discussing the divergence and gradient. Consider 
an approximately rectangular area, similar to that in Fig. 38, 


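bounded by $q_1$, $q_1 + dq_1$, $q_2$, $q_2 + dq_2$. The line integral about
the circuit is

$$F_1(q_1,q_2)\,ds_1 + F_2(q_1+dq_1,q_2)\,ds_2 - F_1(q_1,q_2+dq_2)\,ds_1 - F_2(q_1,q_2)\,ds_2 = \left[\frac{\partial}{\partial q_1}\left(\frac{F_2}{h_2}\right) - \frac{\partial}{\partial q_2}\left(\frac{F_1}{h_1}\right)\right] dq_1\, dq_2,$$

where $ds_1 = dq_1/h_1$ and $ds_2 = dq_2/h_2$.
Since this must be $\operatorname{curl}_3 F\, ds_1\, ds_2$, we have

$$\operatorname{curl}_3 F = h_1 h_2\left[\frac{\partial}{\partial q_1}\left(\frac{F_2}{h_2}\right) - \frac{\partial}{\partial q_2}\left(\frac{F_1}{h_1}\right)\right] \qquad (6)$$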

with analogous relations for the two other components. 

We can illustrate the formulas by showing that the curl of
the field of a straight wire is zero at every point off the wire.
Let us take cylindrical coordinates, in which $r = q_1$, $\theta = q_2$,
$z = q_3$, $h_1 = 1$, $h_2 = 1/r$, $h_3 = 1$.
The assumed magnetic field, along the tangent, is $H_r = 0$,
$H_\theta = 2i/r$, $H_z = 0$. We then have $H_\theta/h_\theta = 2i$, a constant, so
that its derivative is zero, and the curl vanishes.
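The same result can be seen in rectangular coordinates by differentiating numerically. The following is a minimal sketch, assuming the field components $H_x = -2iy/r^2$, $H_y = 2ix/r^2$ found above; the step size and the sample point are arbitrary choices of ours.

```python
# Finite-difference check that the z component of curl H vanishes off the
# axis for the field of a straight wire, H_x = -2 i y / r^2, H_y = 2 i x / r^2.

def H_wire(x, y, i=1.0):
    r2 = x * x + y * y
    return (-2.0 * i * y / r2, 2.0 * i * x / r2)

def curl_z(field, x, y, h=1e-5):
    """Central-difference estimate of dHy/dx - dHx/dy at the point (x, y)."""
    dHy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2.0 * h)
    dHx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2.0 * h)
    return dHy_dx - dHx_dy
```

For any point with r different from zero, for instance `curl_z(H_wire, 0.7, -1.3)`, the value should be zero to within the error of the difference formula.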

146. Applications of Stokes's Theorem. — Let us apply Stokes's 
theorem in a few cases. First, if the curl is everywhere zero, 
the line integral of the vector is zero around a closed path. It 
follows that the line integral from one point to another along 
any path is the same. This is the condition for the existence 
of a potential, and we now see that the vanishing of the curl 
is just the condition that we must have in order to set up a 
potential. But in the magnetic case, it is not true that the line 
integral around any path is zero. Any contour including the 
current has an integral different from zero. The whole situation 
is then explained if inside the wire carrying the current the curl 
of H is not zero, but is a vector pointing along the direction of 
the current, of such a magnitude that the total surface integral 
over the cross section of the wire is $4\pi i$. Thus, for example,
a contour going once around the current has a surface integral
of the curl equal to $4\pi i$, which therefore must be the value of the
line integral of the tangential component of H. 
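A short numerical experiment makes the statement concrete. The sketch below, under the assumption that the wire lies along the z axis and carries unit current in e.m.u., sums the tangential component of H around circles; the helper names are ours. A circle enclosing the wire should give $4\pi i$ whatever its center, and a circle not enclosing it should give zero.

```python
import math

def H_wire(x, y, i=1.0):
    """Field of a straight wire along the z axis, current i in e.m.u."""
    r2 = x * x + y * y
    return (-2.0 * i * y / r2, 2.0 * i * x / r2)

def circle(cx, cy, a):
    """Closed path: circle of radius a about (cx, cy), parameter t in [0, 1]."""
    return lambda t: (cx + a * math.cos(2.0 * math.pi * t),
                      cy + a * math.sin(2.0 * math.pi * t))

def circulation(path, n=20000):
    """Sum of H . ds around the closed path, field taken at chord midpoints."""
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        hx, hy = H_wire(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += hx * (x1 - x0) + hy * (y1 - y0)
        x0, y0 = x1, y1
    return total
```

Comparing `circulation(circle(0.3, 0.2, 1.0))` with `circulation(circle(3.0, 0.0, 1.0))` shows the two cases.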

To find the exact relation between the current and curl H, 
we imagine the current in the wire to be spread out through the 
actual material of the wire, as in fact it is. We set up u, the 


current density, or flux of electricity, satisfying the equation of
continuity $\partial\rho/\partial t + \operatorname{div} u = 0$. Then $i = \iint u_n\, dS$, where the
integration is over the cross section of the wire. We must have,
then, $4\pi \iint u_n\, dS = \iint \operatorname{curl}_n H\, dS$, and since this must hold for
any size wire, the natural assumption is that the same relation
holds between the small elements of current, so that $4\pi u_n = \operatorname{curl}_n H$,
or more generally

$$\operatorname{curl} H = 4\pi u. \qquad (7)$$

Here u is in e.m.u. If it is in e.s.u., the equation is
$\operatorname{curl} H = 4\pi u/c$. We can see one result of these equations. If the current
instead of being in a single wire, is distributed through space, 
the curl is different from zero everywhere, and there is no possi- 
bility of writing a potential at all. 

147. Example: Magnetic Field in a Solenoid. — Suppose we 
have an infinite solenoid, of finite radius, with n turns per centi- 
meter, carrying current i, and that we wish to calculate the 
magnetic field inside it. We assume that it is in no external 
magnetic field, so that the field outside is zero. By symmetry, 
the field inside will point in the direction of the axis. Now let 
us apply Stokes's theorem to a path as follows: (1) Inside, along 
a line parallel to the axis, for 1 cm. The integral of H will be
$H_i$, the H inside, times unit distance. (2) Straight out, radially,
to the outside of the solenoid. Since H is at right angles, the
integral of H will be zero. (3) Outside, back for 1 cm. along a
line parallel to the axis. The integral is zero since H is zero
outside. (4) Straight in again, closing the figure, and contributing
nothing to the integral. Thus we have $\oint H_s\, ds = H_i$. Now
$\iint \operatorname{curl}_n H\, dS = 4\pi \iint u_n\, dS = 4\pi \times$ total current flowing through
the contour $= 4\pi n i$. Hence we have $H_i = 4\pi n i$, the formula
for the magnetic field inside a solenoid, showing that it is constant 
independent of position. 
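The formula can be compared with a direct sum over the turns. The sketch below is an independent check, not the argument of the text: it assumes the standard on-axis field of a single circular loop, $2\pi i R^2/(R^2 + z^2)^{3/2}$ in e.m.u. (the result asked for in Prob. 6), and adds the contributions of the turns of a long but finite solenoid. The function names and the numerical values are ours.

```python
import math

def loop_axis_field(i, R, z):
    """On-axis field of one circular loop (assumed standard result, e.m.u.)."""
    return 2.0 * math.pi * i * R * R / (R * R + z * z) ** 1.5

def solenoid_axis_field(i, R, n, half_length):
    """Field at the center of a solenoid with n turns per cm, summed turn by turn."""
    turns = int(round(2.0 * half_length * n))
    total = 0.0
    for k in range(turns):
        z = -half_length + (k + 0.5) / n   # position of the k-th turn
        total += loop_axis_field(i, R, z)
    return total
```

For a solenoid whose length is a thousand times its radius, `solenoid_axis_field(2.0, 1.0, 10, 1000.0)` should differ from `4 * math.pi * 10 * 2.0` only by a small end correction.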

148. The Vector Potential. — In magnetic fields coming from 
permanent magnets, where there is no current, we can write an 
ordinary potential letting $H = -\operatorname{grad} \phi$. But this is only
possible when curl H = 0, which is not true in the presence of
currents. On the other hand, it can be shown that if the divergence
of a vector is zero, as div H = 0, it is always possible to
set up a vector A, called the vector potential (to distinguish it
from $\phi$, which is called a scalar potential), such that H = curl A.
This is often a useful thing to do. We can prove readily that
div curl A = 0 always, so that we have div H = 0.


The vector potential satisfies a simple differential equation. 
We know that curl A = H, but this does not determine A 
uniquely. In fact, to determine a vector field uniquely we must 
specify both its curl and its divergence, and we can find a vector 
whose curl and divergence are any desired functions. Let us 
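then demand that div A = 0. We now have $\operatorname{curl} H = 4\pi u/c = \operatorname{curl}\operatorname{curl} A$.
It can be proved that $\operatorname{curl}\operatorname{curl} A = \operatorname{grad}\operatorname{div} A - \nabla^2 A = -\nabla^2 A$,
since div A = 0. Hence

$$\nabla^2 A = -\frac{4\pi u}{c}, \qquad (8)$$

similar to Poisson's equation for the scalar potential in terms of
the charge density,

$$\nabla^2 \phi = -4\pi\rho.$$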

These two equations, expanded to include terms depending on 
time, prove to be very important in general electrical theory. 

Let us set up the vector potential for a current in a straight 
wire. Take cylindrical coordinates, with the wire pointing along 
the z axis. Poisson's equation for A is a vector equation, but 
since u has only a z component, A will likewise have only a z 
component, which will depend only on r. Thus we have 

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = -\frac{4\pi u}{c} \quad \text{for } r < R,$$

$$\qquad\qquad = 0 \quad \text{for } r > R,$$

where R is the radius of the wire.
The solutions of this equation are

$$A_z = -\frac{\pi u r^2}{c} + a \ln r + b \quad \text{for } r < R,$$

$$\quad = d \ln r + e \quad \text{for } r > R.$$

Since A cannot become infinite at r = 0, we must have a = 0.
We may choose b = 0. Then d and e must be chosen to make A
and its derivative with respect to r continuous at r = R. Noting
that $\pi R^2 u = i$, the total current, this easily leads to

$$A_z = -\frac{2i}{c} \ln r + \text{constant} \quad \text{for } r > R.$$

The only component of H is then $H_\theta$, which is

$$H_\theta = -\frac{\partial}{\partial r}\left(\frac{A_z}{h_z}\right) = \frac{2i}{cr}.$$
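The matching of the two solutions at the surface of the wire is easily checked numerically. The following sketch assumes the interior solution $-\pi u r^2/c$ with a = b = 0 and fixes d and e by continuity at r = R, as described above; the field is then obtained as $-dA_z/dr$ by a central difference. The function names are ours.

```python
import math

def A_z(r, i, R, c):
    """Vector potential of a straight wire of radius R carrying current i (e.s.u.)."""
    u = i / (math.pi * R * R)                 # uniform current density
    if r <= R:
        return -math.pi * u * r * r / c       # interior solution, a = b = 0
    d = -2.0 * i / c                          # from continuity of dA/dr at r = R
    e = -math.pi * u * R * R / c - d * math.log(R)   # from continuity of A at r = R
    return d * math.log(r) + e

def H_theta(r, i, R, c, h=1e-6):
    """H_theta = -dA_z/dr, estimated by a central difference."""
    return -(A_z(r + h, i, R, c) - A_z(r - h, i, R, c)) / (2.0 * h)
```

Outside the wire `H_theta` should reproduce 2i/(cr). Inside, the same function gives the field of Prob. 10 as a by-product.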


149. The Biot-Savart Law. — In the case of a linear conductor 
carrying a current i, the expression for the vector potential, 
using the solution of Poisson's equation from Chap. XIX, becomes 

$$A = \frac{i}{c}\int \frac{ds}{r},$$

where ds is the vector element of length taken along the conductor,
and pointing in the direction of the current. To find the
intensity of the magnetic field, we take the curl, finding

$$H = \operatorname{curl} A = \frac{i}{c}\int \operatorname{curl} \frac{ds}{r}.$$

In this equation, ds is a vector and r a scalar. In general, if S 
is a scalar and B an arbitrary vector, it is easy to show that 

curl (SB) = S curl B + (grad S) X B. 

Applying this relation to our case, B = ds, and S = 1/r, and we 
must remember that in taking the curl we differentiate only with 
respect to the coordinates which fix the point at which we wish 
the value of H (the field point). Now these coordinates appear 
only in r and not in ds, which depends on the circuit only. Thus
the first term vanishes and we have

$$H = \frac{i}{c}\int \operatorname{grad}\frac{1}{r} \times ds = \frac{i}{c}\int \frac{ds \times \mathbf{r}}{r^3},$$

where $\mathbf{r}$ is the vector from ds to the field point, and r is the length
of this vector. If we imagine that the resultant H is made up of a
sum of contributions from each conductor element ds, we may
write the law in its differential form

$$dH = \frac{i}{c r^2}\left(ds \times \frac{\mathbf{r}}{r}\right). \qquad (10)$$

This is known as the Biot-Savart law. The magnitude of dH is

$$|dH| = \frac{i\, ds}{c r^2} \sin\theta, \qquad (11)$$

where $\theta$ is the angle between the direction of ds and $\mathbf{r}$; the direction
of dH is perpendicular to the plane of ds and $\mathbf{r}$. Applied to
closed circuits it always yields the same results as the integral 
law. For open circuits this is not obvious, since we can add to 
the expressions for dH a differential $d\psi$ provided $\oint d\psi$ around a


closed curve is zero. In this way we leave the law for closed 
circuits unaltered, but for open circuits change the value of H 
so calculated. Thus the integral law must be looked upon as the 
more fundamental. 
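As an illustration of the differential law, we may sum the contributions (10) numerically for a long straight wire and compare with the field 2i/(cr) found from the vector potential. This is only a sketch under our own assumptions: the wire is taken along the z axis, it is cut off at a large but finite half length, and the function names are ours.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def biot_savart_straight_wire(point, i, c, half_length, n=200000):
    """Sum dH = (i/c) ds x r / r^3 over a wire along z from -L to +L."""
    dz = 2.0 * half_length / n
    H = [0.0, 0.0, 0.0]
    for k in range(n):
        z = -half_length + (k + 0.5) * dz
        ds = (0.0, 0.0, dz)                               # element along the current
        r = (point[0], point[1], point[2] - z)            # from ds to the field point
        r3 = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 1.5
        dH = cross(ds, r)
        for j in range(3):
            H[j] += (i / c) * dH[j] / r3
    return tuple(H)
```

At a point 2 cm from the wire, with i = c = 1 in arbitrary units, the tangential component should approach 2i/(cr) = 1 as the wire is made longer, and the other components should vanish.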


1. Prove that a double layer of moment m per unit area leads to a potential
$\phi$ at point P equal to $m\Omega$, where $\Omega$ is the solid angle subtended by the
area from the point P.

2. Show that in the electrostatic system of units, charge has the dimensions
$m^{1/2}l^{3/2}t^{-1}$, current the dimensions $m^{1/2}l^{3/2}t^{-2}$, voltage (e.m.f.) the dimensions
$m^{1/2}l^{1/2}t^{-1}$, resistance the dimensions $l^{-1}t$, and capacity the dimensions $l$.

3. Derive the dimensions of charge, current, voltage, resistance, and 
capacity in the electromagnetic system of units. 

4. Prove that if S is a scalar and B a vector 

curl (SB) = S curl B + grad S X B. 

5. Prove div curl F = 0; curl curl F = grad div F $- \nabla^2 F$, where F is any
vector.

6. Using the Biot-Savart law, find the magnetic field at any point on the 
axis of symmetry of a circular loop of wire of radius R carrying a current i. 

7. A current flows in a circular loop of wire, of radius R. Find the vector 
potential of the resulting magnetic field, at large distances compared with R, 
by adding the contributions to the vector potential due to the separate 
elements of current. 

8. Compute the field, from the potential of the last problem, and show 
that it is approximately the field of a single dipole. Find the strength of 
the dipole, in terms of current and radius R. 

9. Two parallel straight wires carry equal currents. Work out the 
magnetic fields due to the two together, in the two cases where the currents 
flow in the same or in opposite directions, drawing diagrams of the lines of
force.


10. Find the magnetic field at points inside a wire carrying a current, 
assuming the wire is straight and of circular cross section and that the 
current has constant density throughout the wire. 

11. Compute the curl in spherical polar coordinates. Verify directly 
that the divergence of a curl is zero in these coordinates. 




We now leave the restriction of the steady state and inquire 
into the extensions of the theory necessary to have it hold for 
nonstationary phenomena. The fundamental fact concerning 
electromagnetic induction may be stated as follows: If a set of 
circuits carrying current (or magnets and circuits) are set in 
relative motion with respect to each other, the currents in the 
circuits change during the relative motion. Instead of formulat- 
ing a law for the induced currents, it is simpler to consider the 
induced electromotive force. Take a closed circuit in the neigh- 
borhood of a moving magnet (or moving circuit), and let N be 
the number of magnetic lines of force through the circuit. Then
the induced electromotive force is $-dN/dt$, expressed in electromagnetic
units, if N is in these units. If the e.m.f. is expressed
in electrostatic units it is equal to $-\frac{1}{c}\frac{dN}{dt}$. The minus sign
expresses what is commonly termed Lenz's law and indicates that
if $dN/dt$ is represented by a vector going through the circuit, the
induced current flows in a clockwise fashion.

150. The Differential Equation for Electromagnetic Induction. 

We can now state this law in more analytical form. Consider 
the closed curve formed by the circuit, and any surface whose 
boundary is this curve, so that the surface forms a sort of cap 
over the curve. Then the magnetic flux 

$$N = \iint H_n\, dS$$
where the integral is carried out over the whole surface. Further- 
more the electromotive force is by definition the work done in 
carrying a unit charge once around the circuit. This work may 
be done either by the electric field or by chemical forces in a 
battery. Since the latter are considered absent we have 

$$\text{e.m.f.} = \oint E_s\, ds$$



where the integral is taken completely around the circuit. The 
fact that this line integral does not vanish shows us at once that 
we shall not be able to introduce a potential, as we have done 
in the electrostatic case. Thus we have 

$$\oint E_s\, ds = -\frac{d}{dt}\iint H_n\, dS.$$



It should be noticed that the flux of the magnetic field through
the circuit may change in several ways, either by changing $H_n$,
or by changing the shape of the circuit, thus causing a change in
the enclosed area, or by moving the undeformed circuit to other
parts of space where $H_n$ is different. In general dN/dt is composed
of several terms. In the case of fixed circuits, we may
replace the total time derivative by the partial derivative so that
$dN/dt = \partial N/\partial t$. With the help of Stokes's theorem we rewrite
the induction law as

$$\iint \operatorname{curl}_n E\, dS = -\iint \frac{\partial H_n}{\partial t}\, dS.$$

This holds for any fixed circuit, and hence for any fixed area of
integration. Thus it must hold for an infinitesimal area dS, so
that the integrands must be equal and we obtain

$$\operatorname{curl} E = -\frac{\partial H}{\partial t}.$$

This is the differential form of the induction law. In it, E and H 
are both expressed in e.m.u. If E is expressed in e.s.u. and H in
e.m.u., the law takes the form

$$\operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t}. \qquad (2)$$
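The integral form of the law lends itself to a simple numerical illustration. The sketch below takes an example of our own, a flat coil of given area rotating uniformly in a constant field so that $N = HA\cos\omega t$, and forms $-dN/dt$ by a central difference. The names and numerical values are assumptions chosen only for the illustration.

```python
import math

C_LIGHT = 2.998e10   # ratio of the units, cm/sec

def rotating_coil_flux(H, area, omega):
    """Flux N(t) = H * area * cos(omega t) through a uniformly rotating coil."""
    return lambda t: H * area * math.cos(omega * t)

def emf_emu(flux, t, dt=1e-6):
    """Induced e.m.f. in e.m.u.: minus the time rate of change of the flux."""
    return -(flux(t + dt) - flux(t - dt)) / (2.0 * dt)

def emf_esu(flux, t, dt=1e-6):
    """The same e.m.f. expressed in e.s.u., smaller by the factor 1/c."""
    return emf_emu(flux, t, dt) / C_LIGHT
```

For this flux the result should agree with the analytic value $HA\omega\sin\omega t$.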

151. The Displacement Current.— We have now derived four 
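fundamental electromagnetic equations:

$$\operatorname{div} E = 4\pi\rho,$$
$$\operatorname{div} H = 0,$$
$$\operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t},$$
$$\operatorname{curl} H = \frac{4\pi u}{c},$$

where E, $\rho$, and u are in e.s.u. and H in e.m.u. These are almost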
the Maxwell equations, but there is difficulty with the last of 


them. Of course, we have derived it on the basis of steady closed 
currents and for this case it is surely correct. The difficulty 
occurs when we try to apply this result to nonstationary cases. 
In the nonsteady state we have the new possibility of current 
flowing in "open" circuits. The simplest example is that of the 
discharge of a condenser. Here the current starts at the posi- 
tively charged plate, whose charge diminishes as the current flows 
to the negatively charged plate and annuls the charge there. 
Thus we can look upon the condenser plate as a source (or sink) 
of current. Now if we take the divergence of the last equation, 
we have 


$$\operatorname{div}\operatorname{curl} H = \frac{4\pi}{c}\operatorname{div} u$$


and since the divergence of any curl is zero, we find that div u 
equals zero, which means that the current is always closed and 
there are no sources or sinks. Thus open circuits lead to a
contradiction with this equation. We have derived the equation
from steady-state considerations, however, and if we are to extend 
it to hold under all conditions, it is clear that there must be some 
term which vanishes for the steady state which we must add. 
The equation of continuity applied to electric charge and current 
tells us that 

$$\operatorname{div} u + \frac{\partial\rho}{\partial t} = 0$$

expressing the fact that the flow of current out of a volume 
results in a decrease of charge in that volume. In the steady 
state dp/dt = 0, so that div u = 0, and we have no inconsistency 
with our fundamental equation. It is certainly clear that if 
curl H is to be proportional to a current, this current must be 
divergenceless, and u is not. Maxwell made the bold step of 
assuming that the whole current consisted of two terms u and u',
where u' was so chosen that div (u + u') = 0. In this way the
distinction between open and closed circuits vanishes and a unity
hitherto lacking was given to the laws. Maxwell saw at once
that we must set $u' = \frac{1}{4\pi}\frac{\partial E}{\partial t}$. For then we have

$$\operatorname{div}(u + u') = \operatorname{div}\left(u + \frac{1}{4\pi}\frac{\partial E}{\partial t}\right) = \operatorname{div} u + \frac{1}{4\pi}\frac{\partial}{\partial t}(\operatorname{div} E) = \operatorname{div} u + \frac{\partial\rho}{\partial t} = 0,$$

and this is the equation of continuity which we have been trying
to satisfy. In other words, Maxwell assumed the correct equation
to be

$$\operatorname{curl} H = \frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi u}{c}. \qquad (4)$$

The new term $\frac{1}{4\pi}\frac{\partial E}{\partial t}$ is called the displacement current, in contrast
to the convection current u.

Actually the real advance of Maxwell over his predecessors 
lies in the introduction of this displacement current. The physi- 
cal meaning of this current can be obtained by considering the 
charging of a condenser. Current flows from one plate through 
the wire to the other plate. If the current is i, this equals the 
rate of increase of charge on the plate. Suppose the plates are of 
area A, separation d, then the field between them is 

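$$E = 4\pi\sigma = \frac{4\pi}{A} \times \text{total charge},$$

and the displacement current density in the region between the
plates is

$$\frac{1}{4\pi}\frac{\partial E}{\partial t} = \frac{\partial\sigma}{\partial t} = \frac{1}{A}\frac{d}{dt}(\text{total charge}) = \frac{i}{A}.$$

Thus the displacement current is $\frac{A}{4\pi}\frac{\partial E}{\partial t} = i$, and is equal to the
convection current in the wire, so that the current becomes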
continuous throughout the circuit. The fundamental assump- 
tion of Maxwell was that the displacement current is always 
present when an electric field varies in time and produces the 
same magnetic effects as convection currents. 
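The equality of the two currents for the condenser can be verified in a few lines. The sketch below assumes a charge on the plate that grows at a steady rate, with all quantities in e.s.u.; the function names and the numbers are ours and serve only as an illustration.

```python
import math

def field_between_plates(q, area):
    """E = 4 pi sigma = (4 pi / A) * total charge on the plate (e.s.u.)."""
    return 4.0 * math.pi * q / area

def displacement_current(q_of_t, area, t, dt=1e-3):
    """Total displacement current (A / 4 pi) dE/dt between the plates."""
    dE_dt = (field_between_plates(q_of_t(t + dt), area)
             - field_between_plates(q_of_t(t - dt), area)) / (2.0 * dt)
    return area / (4.0 * math.pi) * dE_dt
```

With a plate charge `q = lambda t: 5.0 + 3.0 * t`, that is, a charging current of 3 units, `displacement_current(q, 20.0, 1.0)` should return the same 3 units, independent of the plate area.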

It is clear that a test of Maxwell's hypothesis can only be made
with very rapidly varying fields, since we must make
$\frac{1}{4\pi}\frac{\partial E}{\partial t} \gg u$ in order to keep the convection current effects from masking
the displacement current effects. As is well known, Hertz, in 
1888, performed the experiments on electric waves which con- 
firmed this assumption of Maxwell. There is an interesting 
connection between the displacement current and the Biot- 
Savart law. All the attempts before Maxwell were to find a 
correct form of the Biot-Savart law for "open" circuits. As we 
pointed out in the last chapter, the addition of a total differential 
to this law would yield nothing when it was applied to closed 


circuits, and the hope was that the correct form to be added to 
this law could be found so as to account for open circuit 

152. Maxwell's Equations. — We can now write the correct 
Maxwell equations 

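$$\operatorname{curl} H = \frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi u}{c}, \qquad \operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t},$$

$$\operatorname{div} H = 0, \qquad \operatorname{div} E = 4\pi\rho.$$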

These are the fundamental equations of electromagnetic theory. 
They need extension in but one way. If there are dielectric and 
magnetic bodies present, in them Coulomb's law and its analogue 
for the magnetic field become

$$F = \frac{e e'}{\epsilon r^2}, \qquad F = \frac{m m'}{\mu r^2},$$

where $\epsilon$ is the dielectric constant and $\mu$ the magnetic permeability.
We now introduce a new vector called the electric displacement
D, defined by $D = \epsilon E$, where E is the intensity of the electric
field. Similarly, we introduce the magnetic induction vector
$B = \mu H$. It is easy to see from our previous work that we now
have the relation $\operatorname{div} D = 4\pi\rho$. Furthermore, Faraday's induction
law refers to the rate of change of magnetic flux through a
circuit and hence H must be replaced by B in this relation.
Finally, we have $\operatorname{div}\operatorname{curl} E = 0 = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} B$, so that
div B = 0, rather than div H = 0. The final equations are thus
found to be:

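$$\operatorname{curl} H = \frac{1}{c}\frac{\partial D}{\partial t} + \frac{4\pi u}{c}, \qquad \operatorname{curl} E = -\frac{1}{c}\frac{\partial B}{\partial t},$$

$$\operatorname{div} B = 0, \qquad \operatorname{div} D = 4\pi\rho,$$

$$B = \mu H, \qquad D = \epsilon E. \qquad (5)$$

In these equations, E, D, $\rho$, u are in electrostatic units, H and
B in electromagnetic units. In Chap. XXIV we discuss in detail
the significance of B and D, and the interpretation of $\epsilon$ and $\mu$.

Maxwell's equations suffice to determine the field, when we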
are given the charges and currents. To make a complete set of 
dynamical principles, however, we need two more relations. 


First is the formula giving the force acting on a charge and 
current. The electrical force per unit volume is simply pE, the 
force on unit charge multiplied by the charge per unit volume. 
The magnetic force is that acting on the current, as observed in 
the ordinary action of the electric motor. This force acts at 
right angles both to the current and to the magnetic field, and 
is proportional, as is shown in the elementary study of electricity, 
to the current (in electromagnetic units) times the component 
of magnetic field at right angles to the current. For unit volume, 
this is just given by the vector product $u \times H$. If u is in electrostatic
units, it is $\frac{u}{c} \times H$. Thus we have for the force vector

$$F = \rho E + \frac{1}{c}(u \times H).$$

If the current density is produced by the motion of charge, we
have $u = \rho v$, where v is the velocity vector of the charge. In this
case

$$F = \rho\left[E + \frac{1}{c}(v \times H)\right].$$

This relation has been particularly used by Lorentz. 

Finally, one must have a law, such as Newton's law stating 
that the force is equal to mass times acceleration, determining 
the motion of charge in terms of the force acting. With such a 
law, we find the field from the charge, the force from the field, 
and the motion from the force, obtaining therefore a complete 
system of dynamics. 

Let us now summarize the various steps gone through in build- 
ing up Maxwell's equations. Consider first the static case. 
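Here $\partial D/\partial t = \partial B/\partial t = 0$ and u = 0. The equations become

$$\operatorname{curl} H = 0, \qquad \operatorname{curl} E = 0,$$

$$\operatorname{div} B = 0, \qquad \operatorname{div} D = 4\pi\rho,$$

$$B = \mu H, \qquad D = \epsilon E.$$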

The three equations on the left are those of magnetostatics, and 
the remaining three are those of electrostatics. Each system is 
completely independent of the other. The equations curl H = 0, 
and curl E = 0, show that scalar potentials exist. 

In the stationary case, we still have $\partial B/\partial t = \partial D/\partial t = 0$, but
now $u \neq 0$. The only one of the equations above which is
modified is $\operatorname{curl} H = 4\pi u/c$, the others remaining unchanged.


It is usual to include Ohm's law in the statement of the equations, 
however. This law is easily stated in differential form by 
considering a small volume, having length L in the direction 
of the current flow, and cross-sectional area A normal to the 
current. We apply Ohm's law in the form p.d. = iR. Here 
the potential difference is the field E times the length L of the 
volume, the current is the area times the current density u, 
and the resistance is the specific resistance times L/A. Hence 
we have 

$$EL = Au\,\frac{L}{A} \times \text{specific resistance},$$

or

$$u = \sigma E, \qquad (6)$$

where $\sigma$, the specific conductivity, is the reciprocal of the specific
resistance. This equation is Ohm's law in the form suitable
for Maxwell's equations, and it is commonly included along with
$D = \epsilon E$ and $B = \mu H$.

If we now proceed to the nonstationary state we must strictly 
use the correct Maxwell relations. But there is a case of utmost 
practical importance, in which $\partial D/\partial t \ll 4\pi u$, and hence for
which the effects of displacement can be neglected in comparison 
with those of the convection currents. The Maxwell equations 
with the displacement current omitted apply to the so-called 
"quasi-stationary" processes, and these form practically the 
whole domain of electrical engineering. The magnetic field 
inside and outside conductors is calculated as if produced only 
by the convection currents, but the induction law is not left 
out as in the stationary state. Here we have a double coupling 
of electric and magnetic fields, first, as in the stationary case, 
where electric currents produce magnetic fields, and, secondly, 
by the induction law. Since the essentially new contribution 
of Maxwell, the displacement current, is neglected in quasi- 
stationary calculations, it is clear that no study in that field 
can give experimental confirmation of Maxwell's idea. 
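To see how well the quasi-stationary condition is met in a metal, we may estimate the ratio of $\partial D/\partial t$ to $4\pi u$ for a field varying sinusoidally with angular frequency $\omega$. With $u = \sigma E$ and $D = \epsilon E$ the ratio of the amplitudes is $\epsilon\omega/4\pi\sigma$. The sketch below is only an order-of-magnitude illustration: the conductivity of about $5 \times 10^{17}$ sec$^{-1}$ in e.s.u. is an approximate handbook figure for copper, and taking $\epsilon = 1$ in the metal is an assumption made for simplicity.

```python
import math

SIGMA_COPPER_ESU = 5.4e17   # approximate conductivity of copper, sec^-1 (assumed value)

def displacement_to_conduction_ratio(eps, omega, sigma):
    """Amplitude ratio of dD/dt to 4 pi u for E varying as sin(omega t)."""
    return eps * omega / (4.0 * math.pi * sigma)

def crossover_frequency(eps, sigma):
    """Angular frequency at which the two terms would be equal."""
    return 4.0 * math.pi * sigma / eps
```

At 60 cycles, `displacement_to_conduction_ratio(1.0, 2 * math.pi * 60, SIGMA_COPPER_ESU)` is smaller than $10^{-15}$, which shows why the displacement current is never noticed in ordinary circuit work.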

153. The Vector and Scalar Potentials. — We observe that, if 
H depends on time, $\operatorname{curl} E \neq 0$, so that there is no potential
for E. The ordinary electrical potential is thus confined to static
problems. Further, if u or $\partial E/\partial t \neq 0$, there is no potential for H.


We have seen in the last chapter how a potential can be intro- 


duced for H: one uses a vector potential A, possible because 
div H = 0. That is, we let 

H = curl A. (7) 

We can do this even in the general case. And it proves that we
can use a scalar potential $\phi$, reducing to the electrostatic potential
in the case of a steady state, but different in other cases,
by a special device. The relation which proves to be satisfied
is that

$$E = -\operatorname{grad} \phi - \frac{1}{c}\frac{\partial A}{\partial t}, \qquad (8)$$

reducing to the familiar $E = -\operatorname{grad} \phi$ when everything is
independent of time. These relations are written for the case
of empty space, where $\epsilon = \mu = 1$, and we shall give the discussion
only for that case.

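To verify our statements about the vector potential A and
the scalar potential $\phi$ we substitute the expressions for E and H
in Maxwell's equations, and see if they can be satisfied by the
proper choice of A and $\phi$. First, we notice that div H = div
curl A = 0, so that this equation is automatically satisfied.
Next we take

$$\operatorname{div} E = -\operatorname{div}\operatorname{grad} \phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} A = -\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} A.$$

This must equal $4\pi\rho$. Now we consider the curl
equations. We have $\operatorname{curl} E = -\operatorname{curl}\operatorname{grad} \phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl} A$.
Since the curl of any gradient is zero, this is
$-\frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl} A = -\frac{1}{c}\frac{\partial H}{\partial t}$, verifying another of Maxwell's equations. Finally
$\operatorname{curl} H = \operatorname{curl}\operatorname{curl} A = \operatorname{grad}\operatorname{div} A - \nabla^2 A$. This must equal

$$\frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi u}{c} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad} \phi - \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} + \frac{4\pi u}{c}.$$

Hence, in order to satisfy Maxwell's equations, we must have

$$-\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} A = 4\pi\rho,$$

$$\operatorname{grad}\operatorname{div} A - \nabla^2 A + \frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad} \phi + \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} = \frac{4\pi u}{c}.$$

But now let us choose A and $\phi$ subject to the condition that
$\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0$. Since div A is so far arbitrary, we can do
this. Then the first equation becomes

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho,$$

and the second

$$\nabla^2 A - \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} = -\frac{4\pi u}{c}. \qquad (9)$$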

These are the equations for the potentials. If A and $\phi$ satisfy
them, then, as we stated before, the fields determined from them
by the equations $E = -\operatorname{grad} \phi - \frac{1}{c}\frac{\partial A}{\partial t}$, $H = \operatorname{curl} A$, satisfy
Maxwell's equations. The equations for the potentials are
of the form called D'Alembert's equation, and as can be seen
are extensions of Poisson's equation, obtained by adding the time
derivatives. We observe that in regions where there is no charge
and current density, the potentials satisfy the wave equation,
which is the homogeneous equation obtained by setting the right
side of D'Alembert's equation equal to zero. That is, $\phi$ and A
are given by functions representing waves traveling with velocity
c. Hence the same thing must be true of the fields E and H.
This is the origin of the theory of electromagnetic waves, and
of the electromagnetic theory of light, and the proof that c,
the ratio of the units, is at the same time the velocity of light.
In regard to our condition imposed on the potentials, that
$\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0$, we can readily show that if the potentials
satisfy Eqs. (9) above, this condition can also be satisfied. For
take 1/c times the time derivative of the first, and the divergence
of the second, and add. Using the fact that $\operatorname{div} \nabla^2 A = \nabla^2 \operatorname{div} A$,
where A is any vector, together with the equation of continuity,
the result is

$$\nabla^2\left(\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\left(\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) = -\frac{4\pi}{c}\left(\operatorname{div} u + \frac{\partial\rho}{\partial t}\right) = 0.$$

That is, the quantity $\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t}$ satisfies the wave equation
everywhere. It can be proved that no function, other than
zero, can satisfy the wave equation everywhere, unless its value
at infinity is different from zero. Hence in an ordinary problem
of charges at finite points, where certainly the potentials must
vanish at infinity, it must be that $\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0$, and in
other cases we can certainly choose the potentials so that this
condition will be satisfied.
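That a disturbance traveling with velocity c satisfies the homogeneous equation can be checked by finite differences. The sketch below works in one space dimension with an arbitrary pulse shape; the Gaussian pulse, the step size, and the value of c in arbitrary units are assumptions made only for the illustration.

```python
import math

def pulse(s):
    """An arbitrary smooth pulse shape (assumed Gaussian for illustration)."""
    return math.exp(-s * s)

def traveling_wave(speed):
    """phi(x, t) = pulse(x - speed * t), a pulse moving toward +x."""
    return lambda x, t: pulse(x - speed * t)

def wave_residual(phi, x, t, c, h=1e-3):
    """Finite-difference value of d2(phi)/dx2 - (1/c^2) d2(phi)/dt2."""
    d2x = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    d2t = (phi(x, t + h) - 2.0 * phi(x, t) + phi(x, t - h)) / (h * h)
    return d2x - d2t / (c * c)
```

The residual should be very nearly zero for `traveling_wave(c)` and distinctly different from zero for a pulse moving at any other speed, such as `traveling_wave(0.5 * c)`.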


1. Show that E and H satisfy the wave equations $\nabla^2 E - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = 0$,
with a similar equation for H, in empty space, where u and $\rho$ are zero, and
$\epsilon = \mu = 1$. (Suggestion: for the first, take the equation for curl E, and
take its curl, then substitute for curl H in terms of E. Proceed in an analogous
way with the other equation.)

2. In a region where u and $\rho$ are zero, but $\epsilon$ and $\mu$ are different from 1,
show that the velocity of light is $c/\sqrt{\epsilon\mu}$.

3. A magnetic field points along the z axis, and its magnitude is propor- 
tional to the time, and independent of position. Find the vector potential. 
Assuming that the scalar potential is zero, find the induced electric field. 
Prove by direct integration using a circular circuit, that the law of induction 

4. Describe the magnetic field between the plates of a condenser while 
it is charging up. 

5. Starting from the induction law, show that the line integral of
$\left(E + \frac{1}{c}\frac{\partial A}{\partial t}\right)$ around a closed path is zero, where A is the vector potential.
From this show that the curl of the above vector vanishes and hence that
$E = -\operatorname{grad} \phi - \frac{1}{c}\frac{\partial A}{\partial t}$, where $\phi$ is the scalar potential.

6. In conductors where $\mu = 1$ and $\rho = 0$ show that E and H both satisfy
differential equations of the form

$$\nabla^2 E - \frac{4\pi\sigma}{c^2}\frac{\partial E}{\partial t} - \frac{\epsilon}{c^2}\frac{\partial^2 E}{\partial t^2} = 0.$$

7. Derive the differential equations satisfied by E and H for quasi- 
stationary processes. 

8. Show that if a voltage is induced in a circuit (2) by a changing magnetic
field due to a circuit (1), the induced e.m.f. in (2) is given by

$$-\frac{1}{c}\frac{d}{dt}\oint A_1 \cdot ds_2,$$

where $A_1$ is the vector potential at the element $ds_2$ due to the current in
circuit (1). For quasi-stationary processes we can write

$$A_1 = \frac{1}{c}\iiint \frac{u_1\, dv_1}{r},$$

where $u_1$ is the current density in circuit (1) and $dv_1$ a volume element
thereof. For linear currents show that the induced e.m.f. is then given by

$$\oint E_s\, ds_2 = -\frac{1}{c^2}\frac{d}{dt}\left(I_1 \oint\oint \frac{ds_1 \cdot ds_2}{r_{12}}\right),$$

where $I_1$ is the current in the first circuit, $r_{12}$ is the distance between $ds_1$ and
$ds_2$.

The coefficient of mutual induction $M_{12}$ is defined as

$$M_{12} = \frac{1}{c^2}\oint\oint \frac{ds_1 \cdot ds_2}{r_{12}},$$

so that the above relation becomes

$$(\text{E.m.f.})_2 = -\frac{d}{dt}(M_{12} I_1).$$


The idea of energy is as useful in electromagnetic theory 
as in mechanics. Maxwell's equations correspond in a general 
way to the equations of motion, and in the present chapter we 
introduce electrical and magnetic energies analogous to the 
potential and kinetic energies. The analogy is particularly 
close with the mechanical energy in a vibrating medium, since 
electrical oscillations in free space, as in a light wave, are similar 
to mechanical oscillations in sound. The energy of an elastic 
solid is distributed throughout the body, each volume element 
having a potential energy on account of its strain, and a kinetic 
energy on account of its velocity. Correspondingly we shall 
find that the electromagnetic energy can be considered as 
localized throughout the field, with a definite density of electrical 
and magnetic energy. Finally, the potential energy is propor- 
tional to the square of the stress or strain, and kinetic energy 
proportional to the square of velocity or momentum, and in a 
similar way here we shall find electrical energy proportional 
to the square of E or D, and the magnetic energy to the square 
of H or B. The analogy can be carried out completely, Maxwell's 
equations, for instance, being written in the form of Lagrangian 
equations; however, we shall not do this. We start the discus- 
sion by deriving the electrical and magnetic energy by elementary 
means from the condenser and solenoid, and then pass to general 
theorems involving energy density and energy flow. 

154. Energy in a Condenser. — Given a condenser of capacity 
C, let its charge at a given moment be q. Assume that we are 
charging up the condenser, and that we wish to know how much 
work we shall have to do on it to charge it. To take a small 
additional charge dq around the circuit, against the difference 
of potential q/C, will require an amount of work (q/C)dq. Thus
the whole work done in setting up a charge Q is

$$\int_0^Q \frac{q}{C}\, dq = \frac{1}{2}\frac{Q^2}{C}.$$



This is the expression for the energy in a condenser which we 
found in Chap. V, Prob. 6. 

But now there is an interesting way in which we may consider 
this. We may imagine that the energy resides directly in the 
electromagnetic field, between the condenser plates. Let the 
area of the plates be A, the distance of separation d, and the dielectric
constant $\epsilon$, so that $C = \epsilon A/4\pi d$. Also the field between
the plates will be E = q/Cd, the difference of potential between
the plates divided by the distance. Hence we have
$\frac{1}{2}q^2/C = \frac{1}{2}E^2 C d^2 = (\epsilon E^2/8\pi)(Ad)$. But Ad is simply the volume of the
condenser, or of the region of space where the field is E. Hence
we may consider the energy to be located in the electromagnetic
field, with a volume density $\epsilon E^2/8\pi$, and the integral of this over
the condenser will give precisely the total energy. 
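
As a numerical illustration of this equality, the sketch below evaluates both expressions for one parallel-plate condenser. The function name `condenser_energy` and the sample values are illustrative assumptions, not taken from the text; Gaussian units are assumed throughout, as in the formulas above.

```python
import math

def condenser_energy(q, area, d, eps=1.0):
    """Return (q^2/2C, field-density energy) for a parallel-plate condenser."""
    C = eps * area / (4 * math.pi * d)      # capacity C = eps*A / (4*pi*d)
    E = q / (C * d)                         # field = potential difference / separation
    from_circuit = 0.5 * q**2 / C           # work done in charging
    from_field = (eps * E**2 / (8 * math.pi)) * (area * d)  # density times volume
    return from_circuit, from_field

# Hypothetical sample values, chosen only for illustration.
w_circuit, w_field = condenser_energy(q=3.0, area=100.0, d=0.5, eps=2.0)
print(w_circuit, w_field)
```

The two printed numbers should agree to rounding error, whatever values are chosen.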

155. Energy in the Electric Field. — It is not difficult to show 
that in an arbitrary electrostatic field the energy is given by 
$V = \frac{\epsilon}{8\pi}\iiint E^2\,dv$. Let us consider two point charges $e_1$ and $e_2$ 
in a medium of dielectric constant $\epsilon$ separated by a distance $r_{12}$. 
The force acting on each is given by Coulomb's law as 

$$F = \frac{e_1 e_2}{\epsilon r_{12}^2}$$

and the potential energy of the system by 

$$V = \frac{e_1 e_2}{\epsilon r_{12}} = \frac{1}{2}\left(\frac{e_1 e_2}{\epsilon r_{12}} + \frac{e_2 e_1}{\epsilon r_{12}}\right).$$

We have written this in two terms, and notice that the first is 
just the charge $e_1$ times the potential, at the point where that charge 
is, due to the charge $e_2$. Similarly the second term is $e_2$ times the 
potential at $e_2$ due to $e_1$. Thus we can write 

$$V = \frac{1}{2}(e_1\varphi_1 + e_2\varphi_2)$$

where $\varphi_1$ and $\varphi_2$ are the potentials. In general for n charges we have 

$$V = \frac{1}{2}\sum_k e_k\varphi_k \qquad (2)$$

and if the charges are distributed in space instead of being point 
charges, this becomes an integral 

$$V = \frac{1}{2}\iiint \rho\varphi\,dv \qquad (3)$$


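where $\rho$ is the density of charge. Now, by Poisson's equation 
we know that $\nabla^2\varphi = -4\pi\rho/\epsilon$, so that the integral can be written 

$$V = -\frac{\epsilon}{8\pi}\iiint \varphi\,\nabla^2\varphi\,dv.$$

We now make use of Green's theorem in its first form 

$$\iiint \psi\,\nabla^2\phi\,dv + \iiint \mathrm{grad}\,\psi\cdot\mathrm{grad}\,\phi\,dv = \iint \psi\,\mathrm{grad}_n\,\phi\,dS$$

where $\phi$ and $\psi$ are any two scalar quantities. Place $\psi = \phi = \varphi$ 
and this becomes 

$$\iiint \varphi\,\nabla^2\varphi\,dv + \iiint E^2\,dv = \iint \varphi\,\mathrm{grad}_n\,\varphi\,dS$$

since $E = -\mathrm{grad}\,\varphi$. 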

Now since we integrate over all space, we must examine the 
behavior of the surface integral as the surface (a sphere of radius 
R, for example) gets larger and larger. The potential $\varphi$ varies 
as $1/R$ for large R, $\mathrm{grad}_n\,\varphi$ as $1/R^2$, and dS is proportional to $R^2$, 
so the whole surface integral vanishes as $R \to \infty$. Thus substituting 
in our expression for V, we find 

$$V = \frac{\epsilon}{8\pi}\iiint E^2\,dv \qquad (4)$$



which is the equation we set out to derive. From our derivation 
it is easy to show that if $\epsilon$ is not constant 

$$V = \frac{1}{8\pi}\iiint E\cdot D\,dv$$

where D is the electric displacement vector. This shows us the 
origin of the name for D. If we think of D as an ordinary displacement 
(per unit volume) of electricity, then the work done 
per unit volume is the scalar product of the force times the displacement. 
In an infinitesimal displacement dD, the work per 
unit volume is proportional to 

$$E\cdot dD = \epsilon E\cdot dE$$

and for a finite displacement D we get something proportional to 

$$\int E\cdot dD = \int \epsilon E\cdot dE = \frac{\epsilon E^2}{2} = \frac{E\cdot D}{2}.$$

Thus, except for the numerical factor $1/4\pi$, we have the potential 
energy per unit volume. 
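
The step from the pairwise energy to Eq. (2) of this section can be checked numerically. The sketch below, with hypothetical helper names `pair_energy` and `half_sum_energy` and arbitrary sample charges, compares the sum of $e_ie_j/\epsilon r_{ij}$ over distinct pairs with $\frac{1}{2}\sum_k e_k\varphi_k$, where $\varphi_k$ is taken as the potential at charge k due to all the other charges.

```python
import math
from itertools import combinations

def pair_energy(charges, positions, eps=1.0):
    """Sum of e_i e_j / (eps r_ij) over distinct pairs of point charges."""
    return sum(charges[i] * charges[j] / (eps * math.dist(positions[i], positions[j]))
               for i, j in combinations(range(len(charges)), 2))

def half_sum_energy(charges, positions, eps=1.0):
    """(1/2) sum_k e_k phi_k, phi_k being the potential at charge k from the others."""
    total = 0.0
    for k, (e_k, r_k) in enumerate(zip(charges, positions)):
        phi_k = sum(e_j / (eps * math.dist(r_k, r_j))
                    for j, (e_j, r_j) in enumerate(zip(charges, positions)) if j != k)
        total += 0.5 * e_k * phi_k
    return total

# Arbitrary illustrative configuration (not from the text).
charges = [1.0, -2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
print(pair_energy(charges, positions), half_sum_energy(charges, positions))
```

The factor 1/2 compensates for each pair being counted twice in the second form.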

156. Energy in a Solenoid. — In a similar way, we may consider 
the magnetic energy in a solenoid to reside in the magnetic 
field within the coil. We have found earlier that the energy 
in a solenoid of self-induction L, in which a current i was flowing, 
was $\frac{1}{2}Li^2$. But now we can easily write this in terms of the field 
H within the solenoid. We have seen that this field is $4\pi ni$, 
where n is the number of turns of the coil per centimeter. The 
coefficient of self-induction L for a coil is easily found. By definition, 
it is the e.m.f. induced when there is unit time rate of change 
of current through the coil. The e.m.f. per turn is 

$$\frac{d}{dt}(B \times \text{cross-sectional area}) = \pi r^2\mu\,\frac{dH}{dt},$$

if r is the radius of the coil, $\mu$ the permeability. Thus the e.m.f. for the whole N turns is 
$N\pi r^2\mu\,dH/dt$. Since $H = 4\pi ni = 4\pi Ni/d$, if N is the whole number of turns, 
d the length, the e.m.f. is $N\pi r^2\mu\,(4\pi N/d)\,di/dt$, so that $L = 4\pi^2N^2r^2\mu/d$. 
Hence we have 

$$\frac{1}{2}Li^2 = \frac{2\pi^2N^2r^2\mu}{d}\left(\frac{Hd}{4\pi N}\right)^2 = \frac{\mu H^2(\pi r^2 d)}{8\pi}.$$

Since $\pi r^2 d$ is the volume, this indicates a volume density of magnetic 
energy of $\mu H^2/8\pi$. 

The proof that the total magnetic energy in a magnetostatic 
field is $\frac{\mu}{8\pi}\iiint H^2\,dv$ or $\frac{1}{8\pi}\iiint H\cdot B\,dv$ is carried out in 
exactly the same manner as the one for the electrostatic energy 
given in the last paragraph. 
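
The algebra for the solenoid can be checked in the same spirit as for the condenser. The sketch below assumes the relations in the form $H = 4\pi Ni/d$ and $L = 4\pi^2N^2r^2\mu/d$, with the current measured in whatever units make the first relation hold; the function name `solenoid_energy` and the sample numbers are illustrative assumptions only.

```python
import math

def solenoid_energy(i, N, r, d, mu=1.0):
    """Return (L i^2 / 2, field-density energy) for a long solenoid."""
    H = 4 * math.pi * N * i / d                     # field inside the coil
    L = 4 * math.pi**2 * N**2 * r**2 * mu / d       # coefficient of self-induction
    from_circuit = 0.5 * L * i**2
    from_field = (mu * H**2 / (8 * math.pi)) * (math.pi * r**2 * d)  # density times volume
    return from_circuit, from_field

# Hypothetical sample values.
w_circuit, w_field = solenoid_energy(i=0.2, N=500, r=1.5, d=30.0, mu=1.0)
print(w_circuit, w_field)
```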

157. Energy Density and Energy Flow. — The examples we 
have considered suggest that in a combined electric and magnetic 
field there should be a volume density $(1/8\pi)(\epsilon E^2 + \mu H^2)$ of 
electromagnetic energy. As a matter of fact, it proves to be 
quite possible to make this assumption, and to carry it out in a 
logical way. One can regard the electromagnetic energy almost 
as a fluid, having a certain density, flowing from place to place 
in the field. Thus, there is a flow vector associated with it, 
called Poynting's vector, which we shall show in the next section 
to be equal to $(c/4\pi)(E \times H)$. We shall prove that there is an 
equation of continuity for the energy: 

$$\mathrm{div}\left[\frac{c}{4\pi}(E \times H)\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right] = 0. \qquad (5)$$

This is only true, however, in regions where electromagnetic 
energy is not being produced. Of course, energy as a whole is 


conserved, but there can easily be sources and sinks of electro- 
magnetic energy. Thus batteries are sources, in which chemical 
energy is converted into electrical energy, and resistances are 
sinks, in which the electrical energy is converted into heat. We 
imagine the field as being worked on by the battery, and as doing 
work against the frictional resistance. Hence our whole relation 
is that d/dt (electromagnetic energy) = rate of production of 
energy from e.m.f. per unit volume — rate of dissipation of energy 
into heat — div (energy flow). This equation, put in mathemati- 
cal form, is Poynting's theorem. 

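158. Poynting's Theorem. — Let us compute the quantity 

$$\mathrm{div}\left[\frac{c}{4\pi}(E \times H)\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right].$$

It can be shown in general that 

$$\mathrm{div}\,(A \times B) = B\cdot\mathrm{curl}\,A - A\cdot\mathrm{curl}\,B.$$

Also $\frac{\partial A^2}{\partial t} = 2A\cdot\frac{\partial A}{\partial t}$. Hence the expression is equal to 

$$\frac{c}{4\pi}\left(H\cdot\mathrm{curl}\,E - E\cdot\mathrm{curl}\,H + \frac{\epsilon}{c}E\cdot\frac{\partial E}{\partial t} + \frac{\mu}{c}H\cdot\frac{\partial H}{\partial t}\right) = \frac{c}{4\pi}\left[H\cdot\left(\mathrm{curl}\,E + \frac{1}{c}\frac{\partial B}{\partial t}\right) - E\cdot\left(\mathrm{curl}\,H - \frac{1}{c}\frac{\partial D}{\partial t}\right)\right].$$

But by Maxwell's equations $\mathrm{curl}\,E + \frac{1}{c}\frac{\partial B}{\partial t} = 0$, 
$\mathrm{curl}\,H - \frac{1}{c}\frac{\partial D}{\partial t} = \frac{4\pi u}{c}$, so that the result is $-E\cdot u$. Hence Poynting's 
theorem is 

$$\mathrm{div}\left[\frac{c}{4\pi}(E \times H)\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right] = -E\cdot u. \qquad (6)$$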

From the analysis of the last section, we see that $-E\cdot u$ must 
represent the total rate of production of electromagnetic energy 
by e.m.f.s minus the rate of dissipation into heat. The latter is 
simple: in regions where Ohm's law holds, $u = \sigma E$, so that here 
we have the contribution $-\sigma E^2$ to the right side. The quantity 
$\sigma E^2$ represents the ordinary dissipation of energy into heat. We 
must examine the other sort of term, the external e.m.f., rather 
more carefully. 

159. The Nature of an E.M.F. — In a conductor carrying a 
current, there will be a current u set up, equal to the total force 


per unit charge, times $\sigma$. The force is ordinarily simply the 
electrical force E. But sometimes there are other sorts of force 
acting. For example, in a battery, the various concentrations of 
electrolytes produce a definite pressure on the ions, forcing them 
mechanically in one direction, and this force would not ordinarily 
be considered as being electrical in nature. Inside a battery, 
the electric field is actually opposite to the flow of current, pointing 
from positive pole to negative, while the current flows from 
negative to positive. But the additional force acting on the 
charges counteracts the electric field, and does enough more so 
that it can push the current through the internal resistance of the 
battery. This latter part is already taken care of in computing 
the work done by the resistance. The former part, just equal 
and opposite to the E in the battery, is the force responsible for 
the applied e.m.f. of the battery. Thus it is $-E$ per unit charge. 
And the rate of working of the force on unit charge is the force 
times the velocity of the charge. We actually wish the rate of 
working per unit volume, so that we must multiply by the charge 
per unit volume. This is p, and its product with the velocity 
is just the current density u, so that we have $-E\cdot u$ as the rate 
of working of the e.m.f. on the electrical system. This is just 
the contribution to the right side of Poynting's theorem which 
we should get inside the batteries. 

160. Examples of Poynting's Vector. — The conception of the 
energy of the electromagnetic field as residing in the medium is a 
very fundamental one, which has had great influence in the devel- 
opment of the theory. Thus Maxwell thought of the medium 
as resembling an elastic solid, the electrical energy representing 
the potential energy of strain of the medium, the magnetic energy 
the kinetic energy of motion. Such a definite view is no longer 
held. Nevertheless, the energy is always believed to travel 
through space. Thus, in a light wave, there is a certain energy 
per unit volume, proportional to the square of the amplitude 
(E or H). This energy travels along, and Poynting's vector is 
the vector which measures the rate of flow, or the intensity of the 
wave. We shall show that the vector actually points along the 
ray of light, the direction of flow. If, for example, we have a 
source of light, and we wish to find at what rate it is emitting 
energy, we surround it by a closed surface, and integrate the 
normal component of Poynting's vector over the surface. The 
whole conception of energy being transported in the medium is 


evidently quite fundamental to the electromagnetic theory of light. 

When we come back to charges and currents, however, it is a 
little harder to see the significance of the energy in the medium. 
For example, in a circuit consisting of a battery, and a wire con- 
necting the plates, Poynting's vector indicates that the energy 
flows out of the battery, through the space surrounding the wire, 
and finally flows into the wire at the point where it will be trans- 
formed into heat. This seems to have small physical significance. 
In a moving electron, the situation is somewhat more reasonable. 
Suppose that the electron at rest is to be represented by a sphere 
of radius R, on the surface of which the charge is distributed. 
Then the field will be $e/r^2$ at any point outside the sphere. The 
total electrical energy is the volume integral of $\frac{1}{8\pi}\frac{e^2}{r^4}$ over all 
space outside the sphere, or 

$$\frac{1}{8\pi}\int_R^\infty \frac{e^2}{r^4}\,4\pi r^2\,dr = \frac{e^2}{2R}.$$


In the theory of the electron, it is this quantity which is inter- 
preted as being the actual constitutive energy of the electron; 
although a correction must be made of an additional energy 
required to keep the sphere in equilibrium. Neglecting this 
correction, we can compute the mass of the electron. For a 
relation of Einstein says that a given energy has a mass, given 
by the relation, energy = $mc^2$. Hence $mc^2 = e^2/2R$. Solving 
for the radius, we have $R = e^2/2mc^2$, a familiar formula for the 
radius of the electron. The correct formula, inserting the correction 
we omitted, differs only by a small factor. Inserting the 
correct values of $e = 4.774 \times 10^{-10}$ e.s.u., $m = 9.00 \times 10^{-28}$ 
gm., $c = 3 \times 10^{10}$ cm. per second, we have $R = 1.41 \times 10^{-13}$ 
cm. Now if this electron moves, it will have a magnetic field, 
as a current would, and hence will have a certain magnetic energy. 
Since the magnetic field is proportional to the velocity (or the 
current), the magnetic energy is proportional to the square of the 
velocity. This can be shown to be the kinetic energy. Further, 
there will be a Poynting vector, pointing in general in the direc- 
tion of travel of the electron, and representing the flow of energy 
associated with the electron. All these relations prove on closer 
examination to be more complicated than they seem at first sight; 
but they lead to a consistent theory of the nature of the electron. 



It should be stated, however, that this theory does not fit in with 
the quantum theory, and that its correct form on the basis of 
that theory is not known at present. 
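
The arithmetic for the radius quoted in this section is easily reproduced. The short sketch below uses exactly the values of e, m, and c given in the text; only the function name `electron_radius` is introduced here for illustration.

```python
def electron_radius(e, m, c):
    """Radius R from m c^2 = e^2 / 2R, neglecting the equilibrium correction."""
    return e**2 / (2 * m * c**2)

# Values as quoted in the text (e.s.u., grams, cm per second).
R = electron_radius(e=4.774e-10, m=9.00e-28, c=3e10)
print(f"R = {R:.3e} cm")   # close to the 1.41e-13 cm quoted in the text
```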

161. Energy in a Plane Wave. — Let us compute the flow of 
energy in a plane wave of light. It was shown in the last chapter 
and its problems that the potentials and fields satisfy a wave 
equation of the form 

$$\nabla^2\phi - \frac{\epsilon\mu}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0,$$

corresponding to propagation with the velocity $v = c/n$, where 
$n = \sqrt{\epsilon\mu}$. Here n, the ratio of the velocity of light in empty 
space to the velocity in the medium we are interested in, is 
the index of refraction. It is easy to set up a plane wave solution 
of the wave equation. Thus a wave of frequency $\nu$, 
propagated in a direction whose direction cosines are f, g, and h, 
is represented by 

$$E = E_0\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]}.$$

$E_0$ is a constant vector, measuring the amplitude of the wave. 
The exponent is constant, representing constant phase, or a 
wave front, when $fx + gy + hz = (c/n)t = vt$. Now $fx + gy + hz$ 
is the projection of the radius vector x, y, z on the direction 
f, g, h, so that, as we see from Fig. 42, all points for which 
$fx + gy + hz$ is constant lie on a plane whose normal is f, g, h, and 
whose distance from the origin is given by the constant. If this 
constant is vt, the plane travels out with a velocity v, as a wave 
front should. To have a wave of arbitrary phase, $E_0$ would have 
to be a complex vector. We can immediately show by substitution 
that the wave as we have written it is a solution of the wave 
equation. For instance, $\partial E/\partial x = -(2\pi i\nu nf/c)E$, and carrying 
out the various differentiations and substitutions, and making 
use of the relation $f^2 + g^2 + h^2 = 1$, the result follows at once. 

Fig. 42. — Plane wave front AB, satisfying equation 
fx + gy + hz = constant = distance OB. 

Having the form of the solutions for E and H, we may apply 
Maxwell's equations. We note that the wave equations separate 
E and H completely, but Maxwell's equations prescribe relations 
between them, so that actually Maxwell's equations are 
more restrictive than the wave equation. First, we cannot hope 
to satisfy the relations unless E and H both have the same 
exponential factor, corresponding to the same frequency and 
wave normal. Assuming this to be true, we can apply the equations 
in succession. Let us first take div D = 0. This leads at 
once to 

$$-\frac{2\pi i\nu n\epsilon}{c}(fE_x + gE_y + hE_z) = 0,$$

showing that the scalar product of the unit vector along the wave normal, 
which we may call k, and E, is zero. In other words, E and D have no component 
along the wave normal, or are in the plane of the wave front. 
Similarly div B = 0 shows that B and H are in the plane of the 
wave front. Next take the curl equations, beginning with 
$\mathrm{curl}\,H = \frac{\epsilon}{c}\frac{\partial E}{\partial t}$. This gives for its x component 

$$-\frac{2\pi i\nu n}{c}(gH_z - hH_y) = \frac{\epsilon}{c}(2\pi i\nu)E_x,$$

which is the x component of 

$$E = -\frac{n}{\epsilon}(k \times H) = -\sqrt{\frac{\mu}{\epsilon}}\,(k \times H),$$

showing that E is at right angles both to H and the wave normal, 
these three then forming a set of three orthogonal directions. 
Further, since k and H are at right angles, the magnitude of E 
equals $\sqrt{\mu/\epsilon}$ times the magnitude of H. The fourth equation 
can be easily shown to lead to the same condition. 
Now we find the energy density. It is evidently 

$$\frac{1}{8\pi}(\epsilon E^2 + \mu H^2) = \frac{\epsilon E^2}{4\pi},$$

as we see from the relations between E and H. Setting 
$E = E_0\cos 2\pi\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]$, and squaring, we have 
a quantity oscillating with time, but its time average, which 
alone has physical significance, is $E_0^2/2$. Hence the mean energy 
density is $\epsilon E_0^2/8\pi$. Next, Poynting's vector, being at right 
angles to E and H, is along k, as it should be. Its magnitude is 
$(c/4\pi)E \times \sqrt{\epsilon/\mu}\,E$, so that its mean is $(c/8\pi)\sqrt{\epsilon/\mu}\,E_0^2$, or 
$c/\sqrt{\epsilon\mu}$ times the energy density. But this is the result we 
should expect. This energy would be contained in a volume 1 sq. 
cm. in cross section, and of length $v = c/\sqrt{\epsilon\mu}$ cm. But if the 
light moves with a velocity v along the long axis of the volume, 
this energy will cross the 1 sq. cm. in one second, so that it should 
represent the flow vector, or Poynting's vector. 
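
The relation between the mean flow and the mean density can be illustrated with a few lines of arithmetic. In the sketch below the function name `plane_wave_means` and the sample values of the amplitude and of the constants of the medium are assumptions made for illustration; the time average of the squared cosine is taken as 1/2, as in the text.

```python
import math

def plane_wave_means(E0, eps=1.0, mu=1.0, c=3e10):
    """Mean energy density and mean Poynting magnitude for a plane wave."""
    H0 = math.sqrt(eps / mu) * E0                            # amplitude relation between E and H
    mean_density = (eps * E0**2 + mu * H0**2) / (16 * math.pi)  # (1/8 pi)(...) times 1/2
    mean_flow = c / (8 * math.pi) * E0 * H0                  # (c/4 pi) E H times 1/2
    return mean_density, mean_flow

# Hypothetical medium and amplitude.
u, S = plane_wave_means(E0=2.0, eps=2.25, mu=1.0)
print(S / u, 3e10 / math.sqrt(2.25 * 1.0))   # flow / density compared with c / sqrt(eps mu)
```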

162. Plane Waves in Metals. — Let us consider the propagation 
of a plane wave in a metallic conductor, where for simplicity 
we shall take $\mu = 1$, $\rho = 0$, but $u = \sigma E$. Rather than satisfying 
the wave equation first and then substituting in Maxwell's 
equations, as we did in the preceding case, we shall vary the 
procedure by assuming a wave with undetermined velocity, 
and satisfying all four of Maxwell's equations (in the preceding 
case only three of Maxwell's equations, and the wave equation, 
were actually used, Maxwell's fourth equation being auto- 
matically satisfied). Let us then assume that E and H are given 
by expressions of the form 

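$$E_0\,e^{2\pi i\nu\left[t - \frac{\alpha}{c}(fx + gy + hz)\right]}, \qquad (9)$$

where $\alpha$ is to be determined. The divergence equations show as 
before that E and H are both in the plane of the wave front. 
The equation for curl E leads to $\alpha(\mathbf{n} \times E) = H$ (with $\mathbf{n}$ the unit 
vector along the wave normal), showing as 
before that E and H are at right angles to each other, and that 
the magnitude of H is $\alpha$ times the magnitude of E. The equation 
$\mathrm{curl}\,H = \frac{\epsilon}{c}\frac{\partial E}{\partial t} + \frac{4\pi\sigma}{c}E$ gives a new condition, 

$$-\frac{2\pi i\nu\alpha}{c}(\mathbf{n} \times H) = \left(\frac{2\pi i\nu\epsilon}{c} + \frac{4\pi\sigma}{c}\right)E.$$

This condition likewise shows that E and H are at right angles to 
each other, but now gives the magnitude of H equal to $\frac{1}{\alpha}\left(\epsilon - \frac{2i\sigma}{\nu}\right)$ 
times the magnitude of E. These conditions are only consistent if 

$$\alpha = \frac{1}{\alpha}\left(\epsilon - \frac{2i\sigma}{\nu}\right), \qquad \alpha^2 = \epsilon - \frac{2i\sigma}{\nu}. \qquad (10)$$

We see, in other words, that $\alpha$, the quantity corresponding to 
the index of refraction, is complex. Let us write $\alpha = n - ik$, 
where n and k are real, so that, as we can easily see, 

$$n^2 - k^2 = \epsilon, \qquad nk = \sigma/\nu,$$

$$n = \left[\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} + \epsilon\right)\right]^{1/2}, \qquad (11)$$

$$k = \left[\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} - \epsilon\right)\right]^{1/2}.$$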

To understand the meaning of n and k, we substitute in the original 
expression for the plane wave, Eq. (9). This can be written 

$$E_0\,e^{-\frac{2\pi\nu k}{c}(fx + gy + hz)}\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]}.$$

The second factor is just like an ordinary plane wave, with index 
of refraction n, though since n depends on frequency, we find the 
Maxwell theory predicting dispersion of electromagnetic waves 
in metals. But the first factor, a pure exponential term decreasing 
as $fx + gy + hz$ increases, means that there is a decrease of 
amplitude and energy as the wave travels along, or an absorption, 
as we can easily see from an application of Poynting's theorem, 
computing the Joule heating within the metal. For this reason 
k is called the absorption coefficient. 
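
To see the sizes of n and k in a concrete case, Eq. (11) can be evaluated directly and checked against Eq. (10). The sketch below is only an illustration: the function name `metal_index` and the numerical values of $\epsilon$, $\sigma$ (in electrostatic units), and $\nu$ are assumptions chosen for the example, not data from the text.

```python
import math
import cmath

def metal_index(eps, sigma, nu):
    """Real index n and absorption coefficient k from Eq. (11)."""
    root = math.sqrt(eps**2 + 4 * sigma**2 / nu**2)
    n = math.sqrt(0.5 * (root + eps))
    k = math.sqrt(0.5 * (root - eps))
    return n, k

# Illustrative values only: a good conductor at an optical frequency.
eps, sigma, nu = 1.0, 5e17, 5e14
n, k = metal_index(eps, sigma, nu)
alpha = complex(n, -k)                       # alpha = n - ik
print(n, k)
print(alpha**2, complex(eps, -2 * sigma / nu))   # both sides of Eq. (10)

# Distance over which the amplitude factor exp(-2 pi nu k s / c) falls to 1/e.
c = 3e10
print(c / (2 * math.pi * nu * k), "cm")
```

When $\sigma/\nu$ is large compared with $\epsilon$, n and k come out nearly equal, which the printed values should show.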

We have found that the magnitude of H is $\alpha$, or $n - ik$, times 
the magnitude of E. If we write the complex number $n - ik$ 
in the exponential form, we have 

$$n - ik = \sqrt{n^2 + k^2}\,e^{-2\pi i\nu\delta},$$

where $\delta = \frac{1}{2\pi\nu}\tan^{-1}\frac{k}{n}$, and 

$$|H| = E_0\sqrt{n^2 + k^2}\,e^{-\frac{2\pi\nu k}{c}(fx + gy + hz)}\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz) - \delta\right]},$$

so that there is a phase difference between E and H in a conductor, 
whereas in an insulator they are in phase. The details of the 
calculation of electric and magnetic energies are left to a problem. 


Problems 

1. If the generation of heat per cubic centimeter in a conductor carrying 
a current is $\sigma E^2$, prove that for a cylindrical conductor of resistance R, 
carrying a current i, the rate of generation is i 2 R. 

2. Given a cylindrical wire carrying a current. Find the values of E and 
H on the surface of the wire, computing Poynting's vector, and show that it 
represents a flow of energy into the wire. Show that the amount flowing 
into a given length of wire is just enough to supply the energy which appears 
as heat in the length. Note that the surface of a wire carrying current is 
not an equipotential so that there can be a component of electric field 
parallel to it. 

3. Prove div (A X B) = B • curl A - A • curl B. 


4. The maximum electric field in a light wave is 0.1 volt per centimeter. 
Find how much energy is transported by the beam across 1 sq. cm. per 
second. 

5. Given a 40-watt lamp, and suppose that all its energy is dissipated in 
radiation of one wave length or another. Take a sphere of radius 1 m. 
surrounding it, and suppose the radiation is of equal intensity in all direc- 
tions. Find the maximum electric field in the radiation at this distance, 
in volts per centimeter, and the maximum magnetic field in gauss. Find the 
energy per cubic centimeter at this distance, in ergs per cubic centimeter. 

6. Apply Poynting's theorem to the case of a plane wave traveling in a 
conductor and show that the rate of dissipation of electromagnetic energy 
just equals the Joule heating. 

7. Calculate the electric and magnetic energies in a plane wave traveling 
in a metal and show by direct comparison that they are different from each 
other. What happens in the limiting cases $\sigma \to 0$ and $\sigma \to \infty$, i.e., insulators 
and perfect conductors? 

8. Investigate the behavior of n and k for a metal as functions of frequency, 
drawing curves. Take $\epsilon = 1$, and take the conductivity of copper. 
Note that the conductivity in electrostatic units has the dimensions of a 
frequency, and find in what part of the spectrum this frequency lies. Show 
that the value of $\epsilon$ is only significant when the frequency becomes greater 
than $\sigma$. 

9. The significance of $\sigma$ as a frequency is found from the relaxation time, 
the time taken for a volume charge set up within a metal to die down to 1/e 
of its original value. Derive this in the following manner. Set up the 
equation of continuity for the current density u and charge density $\rho$. In 
this, write u in terms of E by Ohm's law, and write the result in terms of $\rho$ 
by the relation $\epsilon\,\mathrm{div}\,E = 4\pi\rho$. Solve the resulting differential equation 
for $\rho$, showing that the solution is $\rho = \rho_0 e^{-t/\tau}$, where $\tau$, the relaxation time, 
is $\epsilon/4\pi\sigma$, so that $\sigma$ is, as far as its order of magnitude is concerned, the 
frequency connected with the relaxation time. 




According to the electromagnetic theory of light, light con- 
sists of electromagnetic waves, propagated according to Maxwell's 
equations. We have already seen how we are led to the wave 
equation for E and H, or for the potentials, and we have investi- 
gated the plane wave solutions of these equations, showing that 
E and H are at right angles to each other and to the direction of 
propagation, the latter being the same as the direction of Poyn- 
ting's vector, giving the energy flow. We shall now investigate 
the electromagnetic theory of some simple optical phenomena, 
beginning with reflection and refraction. 

163. Boundary Conditions at a Surface of Discontinuity. — 
We have seen in the last chapter the conditions that hold for a 
wave in a refracting medium, whose index of refraction is con- 
stant. In the problem of reflection and refraction at a boundary 
between two media, however, the index changes suddenly from 
one medium to the other, and we must investigate what happens 
there. Let us assume that the boundary is a plane normal to the 
z axis. Then we shall apply Maxwell's equations, in the inte- 
grated form, to small regions containing the boundary. Thus 
take a thin flat volume, its faces parallel to the boundary and 
containing it. Let the area of the face be A. Apply to the 
above the divergence theorem, $\mathrm{div}\,D = 4\pi\rho$, or $\iiint \mathrm{div}\,D\,dv = 
\iint D_n\,dS = 4\pi q$, where q is the total charge within the volume. 
The surface integral comes almost wholly from the flat faces; it 
is $A(D_{n2} - D_{n1})$, if $D_2$ is the value of D in the upper medium, $D_1$ 
in the lower. If now the surface is uncharged, q gets smaller 
and smaller as the volume becomes thinner, so that in the limit 
$A(D_{n2} - D_{n1}) = 0$, or $D_{n2} = D_{n1}$. That is, the normal component 
of D is continuous at an uncharged surface. 

Next let us apply the curl equations, to contours of the follow- 
ing sort: infinitesimal contours of long thin shape, in which one 
long side is in one medium, the other in the other, parallel to 



the surface, and the parts of the contour which cross over from 
one medium to the other are of negligible length compared with 
the long sides. Consider $\mathrm{curl}\,H = \frac{1}{c}\frac{\partial D}{\partial t} + \frac{4\pi u}{c}$, or integrated, 

$$\oint H_s\,ds = \iint\left(\frac{1}{c}\frac{\partial D_n}{\partial t} + \frac{4\pi u_n}{c}\right)dS.$$

If there is no surface current, 
D and u are finite vectors, so that as the contour gets narrower 
and narrower, and the area smaller and smaller, the right side 
of this equation will vanish. The left side approaches 
$(H_{s2} - H_{s1})L$, where L is the length of the contour, $H_{s1}$ and $H_{s2}$ are 
the tangential components of H in the media 1 and 2, respectively. 
Thus finally we have $H_{s2} = H_{s1}$, or the tangential component 
of H is continuous. Similarly we show that the tangential 
component of E is continuous. 

Now we can see how to solve a problem involving two media 
separated by a plane surface, as air and glass. In one medium, 
we assume a plane wave approaching the boundary. But it 
must stop at the boundary, for the same plane wave, with the 
same wave length, would not be a solution of the problem for the 
second medium. There must be some wave in the second 
medium, however, for otherwise the boundary conditions could 
not be satisfied. Thus we are led to the existence of the refracted 
ray. As a matter of fact, we find that we cannot satisfy the 
boundary conditions without an incident, refracted, and also 
a reflected ray. By using all these, with proper relations between 
direction, amplitude, etc., we can actually satisfy the boundary 
conditions at the surface of separation of two media. 

164. The Laws of Reflection and Refraction. — Assume a plane 
wave in the first medium, striking the surface of separation. 

This wave will have the form $e^{2\pi i\nu\left(t - \frac{lx + my + nz}{v}\right)}$. Let the surface 
of separation be given by z = 0, the xy plane. Further let 
the axes be so chosen that the wave normal is in the xz plane, 
as in Fig. 43, so that m = 0. Then at points of the surface of 
separation the disturbance is given by $e^{2\pi i\nu\left(t - \frac{lx}{v}\right)}$. It is this 
disturbance which, taken together with the corresponding 
expressions from the reflected and refracted waves, must satisfy 
certain boundary conditions. 

Next we consider a possible refracted wave. It will be in 
general of the form $e^{2\pi i\nu'\left(t - \frac{l'x + m'y + n'z}{v'}\right)}$, so that in the surface 
of separation it will reduce to the value of this with z = 0. 
The boundary conditions must be satisfied for all values of x, y, 
and t, and yet we have only one constant at our disposal, an 
amplitude, in addition to the frequency and direction. It is 
obvious that the only possibility of satisfying the conditions 
will come if we make $\nu' = \nu$, $l'/v' = l/v$, $m' = 0$. For then we 
shall have just the same function of x, y, and t for both incident 
and refracted waves, at all points of the boundary. First, 
then, the refracted wave must have the same frequency as the 
incident one. Next, if the incident wave normal is in the xz 
plane, this must also be true of the refracted wave. Finally, 
there is a relation between the angle of incidence and the angle 
of refraction. We have l = cosine of the angle between the 
wave normal and the x axis = sine of the angle between the 
wave normal and the normal to the surface = sin i, where i is 
the angle of incidence. Similarly, l' = sin r, where r is the angle 
of refraction. Thus we have 

$$\frac{\sin i}{\sin r} = \frac{v}{v'} = \text{index of refraction of the second medium with respect to the first.}$$

In other words, we have the ordinary law of 
refraction, as a necessary consequence of the boundary conditions. 

Fig. 43. — Law of refraction. 
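
The matching condition $l'/v' = l/v$ can be tried out numerically. In the sketch below the function name `refraction_angle` and the chosen velocities and angle are illustrative assumptions; the guard for inputs that admit no real angle r is added only to keep the function well defined and concerns a case not treated in this section.

```python
import math

def refraction_angle(i, v, v_prime):
    """Angle of refraction r from sin i / sin r = v / v'."""
    sin_r = math.sin(i) * v_prime / v
    if abs(sin_r) > 1.0:
        # No real angle satisfies the relation for these inputs.
        raise ValueError("no real angle of refraction for these inputs")
    return math.asin(sin_r)

# Hypothetical example: first medium with v = c, second with index 1.5.
c = 3e10
i = math.radians(30.0)
r = refraction_angle(i, v=c, v_prime=c / 1.5)
l, l_prime = math.sin(i), math.sin(r)
print(math.degrees(r))
print(l / c, l_prime / (c / 1.5))   # the two sides of l/v = l'/v'
```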

Similarly, for the reflected wave, moving in the first medium, 
we see that m must be equal to zero, and l equal to the value 
for the incident wave, showing that the angle of reflection equals 
the angle of incidence. Now the reflected wave must be different 
from the incident wave, and to do this we must have the n for 
the reflected wave the negative of the value for the incident 
one, showing that the reflected wave travels away from the 
surface rather than towards it. 

165. Reflection Coefficient at Normal Incidence. — After prov- 
ing the laws of reflection and refraction, we still have much 
more to do to apply the boundary conditions. For we must 
compute the values of the various vectors at the surface, and 
actually satisfy the conditions. Let us take first the simple 



case of normal incidence, where l = 0, and all waves travel 
along the z axis. Let us suppose that in the incident beam we 
have E along the x axis, H along y. For simplicity we assume 
the first medium to have the index of refraction unity, the second 
the index $n = \sqrt{\epsilon}$. Then in the refracted wave we assume 
that E is along the x axis, H along y, and that the value of E 
is E', so that H' = nE'. In the reflected wave, assume that 
E has a changed phase, H not, so that E is along −x, H along y, 
and each numerically equal to E''. The change of phase of one 
vector and not the other is necessary to reverse the direction 
of the Poynting's vector. 

Now we may apply the boundary conditions. All normal 
components are zero, so that these conditions are automatically 
satisfied. For the tangential component of E, we have $E - E'' = E'$; 
for the tangential component of H, $H + H'' = H'$. 
The latter is then $E + E'' = nE'$. Combining the two, we have 
at once $E' = \frac{2E}{n + 1}$ (by adding), and $E'' = \frac{E'(n - 1)}{2}$ (by subtracting), 
leading to $\frac{E''}{E} = \frac{n - 1}{n + 1}$. This gives us directly the 
reflection coefficient at normal incidence. The ratio of reflected 
to incident intensity is proportional to the ratio of the squares of 
the amplitudes, or $\frac{(n - 1)^2}{(n + 1)^2}$. This shows that the reflected 
intensity is never so great as the incident, but that the ratio 
approaches closer and closer to unity as n becomes larger. It 
is interesting to compute the reflection coefficient for familiar 
substances. For instance, for glass, n is about 1.5, so that the 
coefficient is $(0.5/2.5)^2 = 1/25$, showing that only a few per 
cent of the intensity is reflected from a glass plate at normal 
incidence. 

We can check the energy relations: the amount of energy 
brought to the surface per unit time in the incident wave should 
equal the amount carried away in the refracted and reflected 
waves. The first is $\frac{c}{4\pi}(E \times H)$, whose magnitude is $\frac{c}{4\pi}E^2$. The 
reflected energy is $\frac{c}{4\pi}\frac{(n - 1)^2}{(n + 1)^2}E^2$. The refracted intensity is 

$$\frac{c}{4\pi}(E' \times H') = \frac{c}{4\pi}nE'^2 = \frac{c}{4\pi}\frac{4n}{(n + 1)^2}E^2.$$

The sum of the refracted and reflected intensities is 

$$\frac{c}{4\pi}E^2\left[\frac{(n - 1)^2 + 4n}{(n + 1)^2}\right] = \frac{c}{4\pi}E^2,$$

equal to the incident intensity. 
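
The figures for glass, and the energy balance, can be verified in a few lines. The sketch below takes the incident amplitude as unity and drops the common factor $c/4\pi$; the function name `normal_incidence` is an illustrative assumption.

```python
def normal_incidence(n, E=1.0):
    """Refracted and reflected amplitudes E', E'' at normal incidence (first medium of index 1)."""
    E_refracted = 2 * E / (n + 1)
    E_reflected = E * (n - 1) / (n + 1)
    return E_refracted, E_reflected

n = 1.5                                    # glass, as in the text
E_refr, E_refl = normal_incidence(n)
reflected_fraction = E_refl**2             # (n-1)^2 / (n+1)^2
refracted_fraction = n * E_refr**2         # 4n / (n+1)^2
print(reflected_fraction, refracted_fraction, reflected_fraction + refracted_fraction)
```

The boundary conditions themselves, $E - E'' = E'$ and $E + E'' = nE'$, can be checked from the same two amplitudes.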

166. Fresnel's Equations. — Now we pass to Fresnel's equations, 
the extension of the last section to an arbitrary angle of incidence. 
Here, for the first time, we meet the question of polarization. 
The vector E is at right angles to the direction of propagation, 
but that does not fix the direction uniquely, and it is said that 
the wave is polarized in a particular direction if its electric vector 
points in that direction. Let us then consider the two extreme 
cases. We take the wave normal of the incident wave to be in 
the xz plane, as before. Then we consider the case where the 
electric vector is along the y axis, and the case where it is in the 
xz plane, as in Fig. 44. 

Fig. 44. — Vectors in reflection and refraction. 
Case 1. y axis points down into the paper. E and E' point down, E'' points up. 
Case 2. H, H', H'' all point down. 

Case 1. Electric vector along the y axis. All vectors depend 
on space in the following way, rewriting l, m, n in terms of the 
angles of incidence and refraction: for the incident wave, 

$$e^{2\pi i\nu\left(t - \frac{x\sin i + z\cos i}{v}\right)};$$

for the refracted wave, 

$$e^{2\pi i\nu\left(t - \frac{x\sin r + z\cos r}{v'}\right)};$$

for the reflected wave, 

$$e^{2\pi i\nu\left(t - \frac{x\sin i - z\cos i}{v}\right)}.$$

We take E and E' to be along the y axis. Then H is in the xz 
plane, at right angles to the wave normal. That is, for the 
incident wave, $H_x = -E\cos i$, $H_z = E\sin i$. Similarly, in 
the refracted wave, $H_x' = -nE'\cos r$, $H_z' = nE'\sin r$, and for 
the reflected ray $H_x'' = -E''\cos i$, $H_z'' = -E''\sin i$. Hence, 
we have the following relations: 

Normal component of D: nothing, since D is tangential. 

Normal component of B: $E\sin i - E''\sin i = nE'\sin r$. 

Tangential component of E: $E - E'' = E'$. 

Tangential component of H: $-E\cos i - E''\cos i = -nE'\cos r$. 

Remembering that $\frac{\sin i}{\sin r} = n$, the first two equations reduce to 
the same equation, $E - E'' = E'$. The last is 
$E + E'' = \frac{n\cos r}{\cos i}E' = \frac{\tan i}{\tan r}E'$. From this at once, multiplying the first 
by $\frac{\tan i}{\tan r}$, and subtracting, we have 

$$E\left(\frac{\tan i}{\tan r} - 1\right) = E''\left(\frac{\tan i}{\tan r} + 1\right),$$

$$\frac{E''}{E} = \frac{\tan i - \tan r}{\tan i + \tan r} = \frac{\sin i\cos r - \cos i\sin r}{\sin i\cos r + \cos i\sin r},$$

$$\frac{E''}{E} = \frac{\sin(i - r)}{\sin(i + r)}.$$


This gives the amplitude of the reflected wave, and is one of
Fresnel's equations. We note that as i and r become zero, the
law of refraction becomes i/r = n, i = nr. Thus in the limit
of normal incidence, the ratio approaches $(nr - r)/(nr + r) = (n - 1)/(n + 1)$,
as we found above. We also note, in the other extreme of tangential
or grazing incidence, that i = 90 deg., so that the ratio is
$\sin(90\ \text{deg.} - r)/\sin(90\ \text{deg.} + r) = 1$. That is, the
reflection coefficient equals unity for grazing incidence. The
formula gives a gradual increase of amplitude as the angle of
incidence increases.

Case 2. Electric vector in the xz plane. Let H be along the
y axis in all the waves: $H_y = E$, $H_y' = nE'$, $H_y'' = E''$. Then
we take $E_x = E\cos i$, $E_z = -E\sin i$, $E_x' = E'\cos r$, $E_z' = -E'\sin r$,
$E_x'' = -E''\cos i$, $E_z'' = -E''\sin i$. Then we have:

Normal component of D: $-E\sin i - E''\sin i = -n^2E'\sin r$.

Normal component of B: nothing.

Tangential component of E: $E\cos i - E''\cos i = E'\cos r$.

Tangential component of H: $E + E'' = nE'$.

Using the law of refraction, the first and last are the same,
$E + E'' = nE'$. The other is $E - E'' = E'\cos r/\cos i$. Multiplying
the first by $\cos r/\cos i$, the second by $n = \sin i/\sin r$, and subtracting, we have

$$E\left(\frac{\cos r}{\cos i} - \frac{\sin i}{\sin r}\right) = -E''\left(\frac{\cos r}{\cos i} + \frac{\sin i}{\sin r}\right),$$

$$\frac{E''}{E} = \frac{\cos i\sin i - \cos r\sin r}{\cos i\sin i + \cos r\sin r}.$$

Now we see at once that

$$\sin(i \pm r)\cos(i \mp r) = (\sin i\cos r \pm \cos i\sin r)(\cos i\cos r \pm \sin i\sin r)$$

$$= \sin i\cos i\,(\cos^2 r + \sin^2 r) \pm \sin r\cos r\,(\sin^2 i + \cos^2 i) = \sin i\cos i \pm \sin r\cos r.$$

Hence we have

$$\frac{E''}{E} = \frac{\sin(i - r)\cos(i + r)}{\sin(i + r)\cos(i - r)} = \frac{\tan(i - r)}{\tan(i + r)}.$$

This is the other of Fresnel's equations. 
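
The two Fresnel ratios can be checked numerically. The following is a minimal sketch, assuming light passing from vacuum into a medium of index n; the function names and the value n = 1.5 are illustrative choices, not taken from the text. It evaluates both amplitude ratios and prints the reflected intensities (their squares) at a few angles.

```python
import math

def refraction_angle(i, n):
    """Angle of refraction r (radians) from the law of refraction sin i / sin r = n."""
    return math.asin(math.sin(i) / n)

def ratio_case1(i, n):
    """E''/E for the electric vector along y: sin(i - r) / sin(i + r)."""
    r = refraction_angle(i, n)
    return math.sin(i - r) / math.sin(i + r)

def ratio_case2(i, n):
    """E''/E for the electric vector in the xz plane: tan(i - r) / tan(i + r)."""
    r = refraction_angle(i, n)
    return math.tan(i - r) / math.tan(i + r)

if __name__ == "__main__":
    n = 1.5  # assumed illustrative index, roughly that of glass
    for deg in (1, 20, 40, 60, 80, 89):
        i = math.radians(deg)
        # reflected intensity is the square of the amplitude ratio
        print(deg, ratio_case1(i, n) ** 2, ratio_case2(i, n) ** 2)
```

Both functions divide by zero at exactly i = 0, so the normal-incidence limit (n - 1)/(n + 1) is best approached with a small nonzero angle.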

167. The Polarizing Angle. — In Case 2 of Sec. 166, where the 
electric vector is in the plane of incidence, or the xz plane, we 
notice an interesting fact. If i + r = 90 deg., a perfectly possible
situation, we have $\tan(i + r) = \infty$, so that E"/E = 0.
That is, the amount of reflected light, at this angle, is zero. 
There is no such situation for the other sort of polarization. Suppose,
then, that we take an unpolarized beam, such as would be
emitted by any ordinary source, and reflect it from a mirror at 
this angle, called the polarizing angle. The reflected light will 
consist entirely of the light polarized with the electric vector 
at right angles to the plane of incidence. It was by this phe- 
nomenon that polarized light was first discovered. Light was 
reflected from one mirror at this angle. Then its polarization 
was found by reflecting from a second mirror at the same angle. 
As the second mirror was rotated about the beam as an axis, 
so that the polarization changed from being at right angles to
the plane of incidence to being in the plane, the doubly reflected
beam changed from a maximum intensity to zero. 

The polarizing angle i' is fixed by i' + r' = 90 deg., and this
occurs when cos i' = sin r' . Using the law of refraction, we find 
tan i' = n, thus fixing the definite angle i' . For glass the angle 
of polarization is 56 deg. 
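
As a quick numerical check of the polarizing angle, here is a small sketch, assuming an index n = 1.5 for glass (an illustrative value). The helper names are hypothetical. The Case 2 ratio is written in the product form (cos i sin i - cos r sin r)/(cos i sin i + cos r sin r), which equals tan (i - r)/tan (i + r) but stays finite when i + r = 90 deg.

```python
import math

def case2_ratio(i, n):
    """E''/E for E in the plane of incidence, in a form finite at i + r = 90 deg."""
    r = math.asin(math.sin(i) / n)          # law of refraction
    num = math.cos(i) * math.sin(i) - math.cos(r) * math.sin(r)
    den = math.cos(i) * math.sin(i) + math.cos(r) * math.sin(r)
    return num / den

def polarizing_angle(n):
    """Angle i' (radians) fixed by tan i' = n."""
    return math.atan(n)

n = 1.5                                      # assumed index of glass
i_p = polarizing_angle(n)
r_p = math.asin(math.sin(i_p) / n)
# prints the polarizing angle, the sum i' + r' in degrees, and the Case 2 ratio there
print(math.degrees(i_p), math.degrees(i_p + r_p), case2_ratio(i_p, n))
```

The printed angle is a little over 56 deg., in agreement with the value quoted for glass, and the reflected amplitude for this polarization vanishes to rounding error.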

168. Total Reflection. — For light passing from a dense medium 
with index of refraction n to a vacuum of index 1, the law of 
refraction is n sin i = sin r. For the angle of incidence given by 
sin i = 1/n, we have sin r = 1, r = 90 deg., and the refracted 
ray emerges at grazing incidence. For larger angles of incidence, 
sin r is greater than 1, and there is no real angle r. Physically 
we know that at these angles, greater than the critical angle,
there is total reflection, with no transmitted beam. We can 
easily investigate the situation mathematically. 

In the first place, let us consider the disturbance in the second 
medium, for we find there is a disturbance, even though no
transmitted beam is observed. This is given by an exponential

$$e^{2\pi i\nu\left(t - \frac{x\sin r + z\cos r}{c}\right)},$$

where we remember that the second medium has index 1, velocity
c. But $\cos r = \pm\sqrt{1 - \sin^2 r} = \pm\sqrt{-1}\,\sqrt{n^2\sin^2 i - 1}$, a
pure imaginary. Thus the exponential becomes

$$e^{2\pi i\nu\left(t - \frac{x\sin r}{c}\right)}\;e^{-\frac{2\pi\nu z}{c}\sqrt{n^2\sin^2 i - 1}},$$

where we have used the negative square root. The first factor
represents a wave propagated along the x axis, or parallel to the 
surface of the medium, with an apparent velocity c/sin r, a value 
less than c. The second factor indicates that the amplitude of 
this wave is damped out as z increases, or as we go away from the 
surface, so that the wave fronts (surfaces of constant phase) 
are at right angles to the surfaces of constant amplitude. This 
disturbance ordinarily damps out in a very short distance. Thus 
if $n^2\sin^2 i$ is decidedly greater than 1, the exponential becomes
small when z is a few wave lengths ($\nu z/c$ a reasonably large number).
Consequently the disturbance is not observed. It is
easily shown that Poynting's vector for this wave has no component
normal to the surface, so that it does not carry any energy
away from the surface.
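
To give a feeling for how quickly the disturbance in the second medium dies out, the damping factor can be evaluated directly. This is a sketch under stated assumptions: the depth z is measured in vacuum wave lengths (lambda = c/nu), the index n = 1.5 and the angle of 60 deg. are illustrative, and the function names are hypothetical.

```python
import math

def critical_angle(n):
    """Angle of incidence (radians) at which sin i = 1/n."""
    return math.asin(1.0 / n)

def damping_factor(z_over_lambda, i, n):
    """Amplitude factor exp(-2*pi*(z/lambda)*sqrt(n^2 sin^2 i - 1)) beyond the surface."""
    s = n * n * math.sin(i) ** 2 - 1.0
    if s <= 0.0:
        raise ValueError("angle of incidence does not exceed the critical angle")
    return math.exp(-2.0 * math.pi * z_over_lambda * math.sqrt(s))

n = 1.5                                   # assumed index of the dense medium
i = math.radians(60.0)                    # assumed angle, beyond the critical angle
print(math.degrees(critical_angle(n)))    # critical angle in degrees
for depth in (0.5, 1.0, 3.0):             # depths in wave lengths
    print(depth, damping_factor(depth, i, n))
```

For these values the amplitude falls below one per cent of its surface value within a single wave length.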


The reflected wave may be treated by Fresnel's equations.
Thus, in Case 1 we have

$$\frac{E''}{E} = \frac{\cos i\sin r - \sin i\cos r}{\cos i\sin r + \sin i\cos r} = \frac{a - ib}{a + ib},$$

where $a = \cos i\sin r$, $b = -\sin i\sqrt{n^2\sin^2 i - 1}$. This ratio
can now be written as $e^{-2i\tan^{-1}(b/a)}$, so that E" and E are of the
same magnitude, showing that all the light is reflected, but they
differ in phase. We may write

$$\frac{E''}{E} = e^{i\delta_1}, \qquad \tan\frac{\delta_1}{2} = \frac{\sqrt{\sin^2 i - 1/n^2}}{\cos i}.$$

Similarly for Case 2 we have

$$\frac{E''}{E} = \frac{\cos i\sin i - \cos r\sin r}{\cos i\sin i + \cos r\sin r} = \frac{c - id}{c + id},$$

where $c = \cos i\sin i$, $d = -\sin r\sqrt{n^2\sin^2 i - 1}$. Again all the
light is reflected, but with a change of phase $\delta_2$ given by

$$\frac{E''}{E} = e^{i\delta_2}, \qquad \tan\frac{\delta_2}{2} = \frac{n^2\sqrt{\sin^2 i - 1/n^2}}{\cos i}.$$

Thus, in the general case, where E has components both in the 
xz plane and along the y axis, there is a difference of phase between 
these components upon total reflection, and linearly polarized 
light in general will become elliptically polarized upon total 
reflection. To see this, we note that two vibrations at right 
angles, with the same frequency and phase, produce a resultant 
vector whose extremity moves in a line (plane polarization), but 
if the two components are in different phases the extremity of the 
vector traces out an ellipse. If the phases differ by 90 deg., 
and the amplitudes are equal, the polarization is circular. 

It follows from our expressions for $\delta_1$ and $\delta_2$ that the difference
between these phase angles, which we denote by $\delta$, is given by
the relation

$$\tan\frac{\delta}{2} = \frac{\cos i\sqrt{\sin^2 i - 1/n^2}}{\sin^2 i}.$$

Only at grazing incidence, $i = \pi/2$, or exactly at the critical angle,
$\sin i = 1/n$, does $\delta = 0$, so that our above remarks hold valid
except in these cases. It is
clear that by causing an elliptically polarized beam to be totally 
reflected at the correct angle, it can be transformed into a beam 
of linearly polarized light. 
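
The phase changes on total reflection can be verified with complex arithmetic. The sketch below is a minimal check, assuming n = 1.5 and i = 50 deg. as illustrative values and using a hypothetical function name. It forms the two ratios with cos r taken as the negative imaginary root and compares the resulting phases with the closed expressions for the tangents of the half-angles.

```python
import cmath
import math

def total_reflection_ratios(i, n):
    """Return (Case 1 ratio, Case 2 ratio) E''/E for incidence beyond the critical angle."""
    q = math.sqrt(n * n * math.sin(i) ** 2 - 1.0)
    sin_r = n * math.sin(i)          # law of refraction, n sin i = sin r
    cos_r = -1j * q                  # negative square root, pure imaginary
    ci, si = math.cos(i), math.sin(i)
    ratio1 = (ci * sin_r - si * cos_r) / (ci * sin_r + si * cos_r)
    ratio2 = (ci * si - cos_r * sin_r) / (ci * si + cos_r * sin_r)
    return ratio1, ratio2

n, i = 1.5, math.radians(50.0)       # assumed illustrative values
r1, r2 = total_reflection_ratios(i, n)
d1, d2 = cmath.phase(r1), cmath.phase(r2)
s = math.sqrt(math.sin(i) ** 2 - 1.0 / n ** 2)
print(abs(r1), abs(r2))                                  # both of unit magnitude
print(math.tan(d1 / 2), s / math.cos(i))                 # tan(delta_1 / 2)
print(math.tan(d2 / 2), n * n * s / math.cos(i))         # tan(delta_2 / 2)
print(math.tan((d2 - d1) / 2), math.cos(i) * s / math.sin(i) ** 2)
```

Each printed pair should agree, which confirms both the unit magnitude of the ratios and the formula for the phase difference.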

169. The Optical Behavior of Metals. — We shall now examine 
the law of reflection for light falling on metals, restricting the 
discussion to the case of normal incidence. In the last chapter 
we have already shown that in the case of metals we must 
introduce a "complex index of refraction," $n' = n - ik$, where k
is the extinction coefficient, and in so doing we retain the identical
form of the relations which we have been using in this chapter.
We have already found ($\mu = 1$)

$$n = \sqrt{\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} + \epsilon\right)}, \qquad k = \sqrt{\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} - \epsilon\right)},$$

where $\sigma$ is the conductivity, $\nu$ the frequency, and $\epsilon$ the dielectric
constant. $\epsilon$ is unknown for metals, but since $\sigma$ for metals is
so large (in e.s.u. $\sigma = 10^{18}$), we can neglect $\epsilon$ at least for light of
sufficiently long period. Thus we find

$$n = k = \sqrt{\sigma/\nu}, \qquad (7)$$

relations first found by Drude. For the infra-red, $\sqrt{\sigma/\nu} \gg 1$.
We may still use Fresnel's equations as we have done for total 
reflection. For normal incidence these are simply 

$$\frac{E''}{E} = \frac{n' - 1}{n' + 1},$$

and we must insert a complex value of n' for reflection from
metals. Thus we have

$$E'' = E\,\frac{n - 1 - ik}{n + 1 - ik},$$

and taking the square of the absolute value, we find for the ratio of the reflected
to the incident intensities,

$$R = \frac{n^2 + k^2 - 2n + 1}{n^2 + k^2 + 2n + 1} = \frac{(n - 1)^2 + k^2}{(n + 1)^2 + k^2}. \qquad (8)$$

R is known as the reflective power of the metal. Since n = k,
we may write

$$R = 1 - \frac{4n}{2n^2 + 2n + 1},$$

and since $n = \sqrt{\sigma/\nu} \gg 1$, this becomes

$$R = 1 - \frac{2}{\sqrt{\sigma/\nu}}. \qquad (9)$$

This relation holds experimentally in the far infra-red, down to
$\lambda \cong 5\mu$. The reflective power varies with the color of the incident
light, and colors which are strongly absorbed are also
strongly reflected.
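
The approximation leading to Eq. (9) can be compared with the exact expression (8). The following sketch assumes n = k = sqrt(sigma/nu) with an illustrative ratio sigma/nu = 10^4; the function names are hypothetical.

```python
import math

def reflective_power(n, k):
    """Eq. (8): R = ((n - 1)^2 + k^2) / ((n + 1)^2 + k^2)."""
    return ((n - 1.0) ** 2 + k ** 2) / ((n + 1.0) ** 2 + k ** 2)

def reflective_power_complex(n, k):
    """Same quantity from |(n' - 1)/(n' + 1)|^2 with n' = n - ik."""
    n_c = complex(n, -k)
    return abs((n_c - 1.0) / (n_c + 1.0)) ** 2

def drude_limit(sigma_over_nu):
    """Eq. (9): R = 1 - 2/sqrt(sigma/nu), valid when sqrt(sigma/nu) >> 1."""
    return 1.0 - 2.0 / math.sqrt(sigma_over_nu)

sigma_over_nu = 1.0e4                      # assumed illustrative ratio
n = k = math.sqrt(sigma_over_nu)           # Eq. (7)
print(reflective_power(n, k), reflective_power_complex(n, k), drude_limit(sigma_over_nu))
```

For this ratio the exact and approximate values differ only in the fourth decimal place, and the difference shrinks as sigma/nu grows.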

Problems

1. Light is reflected from glass of index of refraction 1.5. Compute and
plot curves for the reflected intensity as a function of angle, for both sorts of
plane polarization.

2. Find the intensity of light in the refracted medium, for arbitrary angle
of incidence and both types of polarization. Show that the amount of
energy striking the surface is just equal to the amount carried away from it.
Note that the amount striking the surface is computed, not from the whole
of Poynting's vector, but from its normal component.

3. Show that the reflection coefficient from glass to air at normal incidence 
is the same as for air to glass, but that the phases of the reflected beams are 

4. Light passes normally through a glass plate. Find the weakening in
intensity on account of the reflection at the faces.

5. Ten plates of glass of index 1.5 are placed together and used as a
polarizer. Light strikes the plates at the polarizing angle, and the transmitted
light is used. Since all the reflected light is of one polarization, and
the reflections at both surfaces of all plates are enough to remove practically
all of the light of this polarization, the transmitted light will be practically
polarized in the other direction. Find the intensity of both sorts of light
in the transmitted beam, assuming initially unpolarized light, and hence
show how much polarization is introduced. You may have to consider
multiple internal reflection.

6. Derive the expressions for $\tan(\delta_1/2)$ and $\tan(\delta_2/2)$ in the paragraph on
total reflection.

7. Derive the formulas for the phase difference $\delta$ of the two reflected
components of E in the case of total reflection.

8. The conductivity of copper in e.s.u. is $5 \times 10^{17}$ per second. Calculate
the reflective power of copper for wave lengths of light $\lambda = 12\mu$ and $\lambda = 25.5\mu$.
The observed values of 1 - R are 1.6 per cent and 1.17 per cent at
these wave lengths.

9. Consider light linearly polarized so that the incident electric vector
has equal components in the plane of the wave normal and the perpendicular
thereto. If this light falls on a metal, using Fresnel's equations find the
ratio of the reflected components of E. If this ratio is written as $\rho e^{i\delta}$, show that

$$\frac{1 - \rho e^{i\delta}}{1 + \rho e^{i\delta}} = \frac{\sin i\tan i}{\sqrt{n'^2 - \sin^2 i}},$$

where i is the angle of incidence and n' the complex index of refraction of
the metal.



Maxwell's theory and Maxwell's equations are based on the
assumption of dielectrics with dielectric constant $\epsilon$, magnetic
substances with permeability $\mu$, conductors with conductivity $\sigma$.
These assumptions are unsatisfactory for two reasons. First,
cases are known, and in fact are usual rather than exceptional,
in which the three constants mentioned are not really constants.
Thus the permeability of iron depends on the field strength.
The dielectric constant of almost all substances depends on the
frequency; as we have seen, the index of refraction n is given
by the relation $n = \sqrt{\epsilon}$, and the well-known phenomenon of
dispersion shows a dependence of refractivity on wave length
or frequency. An extreme case is water, whose index of refraction
in the visible is about 1.4, and whose dielectric constant is 80,
a result of the fact that the dielectric constant is measured
for static fields, and that n as a function of frequency goes from
$\sqrt{80}$ at zero frequency, through a region in short radio or long
infra-red waves in which the index greatly decreases, so that
with the very high frequency of visible light it is reduced to 1.4.
The second reason why Maxwell's assumptions are unsatisfactory
is that, since matter is known to be composed of electric charges,
electrons with negative charges and atomic nuclei with positive
charges, it ought to be possible to explain these typically electrical
properties of matter directly in terms of the electronic structure,
without having to resort to empirical relations of the sort implied
by a constant or variable dielectric constant. The attempt
to derive the electrical properties of matter from the electron
theory was first made by H. A. Lorentz, and he was successful
not only in explaining the physical meaning of the dielectric
constant, permeability, and conductivity, but in deriving their
dependence on frequency, field strength, etc. Further developments
of the theory, making use particularly of wave mechanics,
have carried the subject much further than Lorentz was able to,
and in our later chapters on wave mechanics we return to these topics.

170. Polarization and Dielectric Constant. — The fundamental 
physical fact about a dielectric is that, when placed in an electric 
field, it acquires surface charges on its faces, proportional to the 
strength of the field. Thus in Fig. 45, a slab of dielectric is 
shown with positive and negative surface charges, as if the posi- 
tive had actually been pulled along to the face by the action 
of the field, the negative pushed to the other face. These surface 
charges, of course, contribute to the field, just as do other charges, 
which we actually have control over. 

The essence of the electron theory is that it 
treats these induced surface charges in the same 
way as any other charges, applying the ordinary 
Maxwell's equations to all charges in existence, 
and not considering dielectrics as being essen- 
tially different from free space, except in so far 
as they contain these polarizable electrons. 
Thus, if $\rho$ and u are charge density and current
density, respectively, of the so-called "real
charge" which we can move about at will, and
$\rho_p$ and $u_p$ the charge and current density of the
charge arising from polarization, we assume Maxwell's equations
for a nonmagnetic medium are

$$\operatorname{curl} H = \frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi}{c}(u + u_p), \qquad \operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t},$$

$$\operatorname{div} E = 4\pi(\rho + \rho_p), \qquad \operatorname{div} H = 0. \qquad (1)$$

Fig. 45. Polarization of dielectric.


In other words, we assume that the field E is the field of all 
charge, both "real" and polarization charge, and that the total 
current resulting from both sources produces the magnetic field. 
The polarization charge must be produced, from the originally
uncharged dielectric, by the motion of positive charges in the 
direction of E, and of negative charge in the opposite direction. 
Suppose that in equilibrium two equal charges of opposite sign 
lie so near together that they exert no appreciable external effect. 
By means of an external field these charges may be displaced 
relative to each other by a distance r. The charges then form a 
dipole of moment 

p = er. 


In producing such a dipole there is clearly a current

$$e\frac{dr}{dt} = ev = \frac{dp}{dt}.$$

If we add the dipole moments of all the polarization electrons
in a unit volume we obtain the polarization vector, or the dipole
moment per unit volume

$$P = \sum p, \qquad (2)$$

and a current density due to these electrons equal to

$$u_p = \rho_p v_p = \frac{\partial P}{\partial t}. \qquad (3)$$

In producing dielectric polarization, charges cross a surface in
the body. In fact all the charges pass across the surface which
originally were contained in a cylinder of base equal to the
surface and length r. If $r_n$ is the component of r normal to
the surface, then we have as the charge passing through the
end dS

$$\sum e\,r_n\,dS = P_n\,dS, \qquad (4)$$

which is the surface charge appearing on dS if this is an element 
of the outer surface of the body. If we consider a closed surface, 
the enclosed volume loses the charge 

$$\iint P_n\,dS = \iiint \operatorname{div} P\,dv$$

according to Gauss's theorem. The density of polarization
electrons remaining is given by $\rho_p = -\operatorname{div} P$, since these have
the opposite sign to those which have crossed the surface. We 
thus can write both polarization charge and current in terms 
of the polarization vector P. 

We have seen that the field E is that resulting from all charge,
including the polarization charge. The displacement D, however,
is simply the field resulting from the real charge $\rho$, so that
$\operatorname{div} D = 4\pi\rho$. To get Maxwell's equations in terms of D, we
take Eqs. (1), and make the substitutions

$$\rho_p = -\operatorname{div} P, \qquad u_p = \frac{\partial P}{\partial t}, \qquad (5)$$

which, as we note, obviously satisfy the continuity equation
for polarization charge and current. Then we have at once,
for the only two equations affected by the change,

$$\operatorname{curl} H = \frac{1}{c}\frac{\partial}{\partial t}(E + 4\pi P) + \frac{4\pi u}{c}, \qquad \operatorname{div}(E + 4\pi P) = 4\pi\rho. \qquad (6)$$

If we set $D = E + 4\pi P$, these become the ordinary Maxwell
equations.

171. The Relations of P, E, and D .— We have seen that E 
measures the field of all charge, D that due to the "real" charge, 
and that P is the polarization per unit volume. To understand 
P better, we may take a unit cube of dielectric, one pair of faces 
being perpendicular to the field. Since the polarization surface 
charge is $P_n$, one of these faces will have a charge on its unit
area of |P|, the other of -|P|, so that the dipole moment of the
cube, coming from these two charges at unit distance apart,
would be P. Similarly, if the volume had had length L parallel
to the field, area A in the plane at right angles, the charges
on the ends would be ±PA, and the moment, remembering that
these are a distance L apart, is PAL, or P times the volume,
showing that the moment is proportional to the volume, so
that it is really correct to regard P as the moment per unit
volume.

Fig. 46. Condenser containing dielectric. Condenser plates a and d have
surface charges $\pm\sigma$. Induced surface charges are shown on faces b and c of
dielectric. The force on unit charge within cavity e is E, and within cavity f is D.

The relations of the three quantities are perhaps best under- 
stood from a simple illustration in the theory of the condenser. 
In Fig. 46 we have a condenser consisting of two parallel plates 
a and d with surface charges $\pm\sigma$, respectively. Between them
there is a slab of dielectric bc, with surface charges ±P, on the
faces c and b, respectively. The field E now is determined from
the whole charge; that is, using our result relating the
discontinuity of field to surface charge, the field within
the dielectric is given by

$$E = 4\pi(\sigma - P).$$

The displacement D, however, is determined only from $\sigma$, so

$$D = 4\pi\sigma = E + 4\pi P. \qquad (7)$$

The capacity of the condenser is given by the charge, $D/4\pi$,
divided by the potential difference, E times the distance L
between the plates, or is $\dfrac{D}{4\pi EL}$. If we define the dielectric
constant $\epsilon$ as the ratio D/E, this leads correctly to the relation
that the capacity of the condenser is $\epsilon$ times the capacity of the
same condenser with vacuum in place of the dielectric.

Let us now consider the meaning of the field within the dielec- 
tric. Actually, on account of the atomic and electronic struc- 
ture, the field will change rapidly from point to point, so that 
it is not so easy as it might seem to define it. The usual method 
is to set up a long needle-shaped cavity e, pointing in the direction 
of the field. A point charge placed within the cavity would now 
be acted on by just the field of real and polarization charges, 
so that the field E is the force on unit charge in such a cavity. 
The necessity of choosing that particular shape of cavity is
shown by considering the cavity f, which is supposed to be disk-shaped,
with its flat face perpendicular to the field. This
cavity will have surface charges ±P set up on its two faces,
and it is evident that the lines of force starting from the polarization
surface charges on faces b and c will terminate on these
faces of the cavity, not crossing it at all, so that the field within
it will come wholly from the real surface charges on a and d,
or will be $E + 4\pi P = D$. Thus if we choose we may define E
as the field in a cavity shaped like e, in which the effect of the
charges on its faces is negligible because the faces are of negligibly
small area and arbitrarily far from the point where we are
finding the field, while we may define D as the field in a cavity
shaped like f. These definitions were originally used for the
corresponding magnetic case, by Kelvin. It is interesting to
notice that the fields in cavities of other shapes are different,
depending on the shape of the cavity. Thus in a later section
we shall see that the field in a spherical cavity is $E + (4\pi P/3)$.


We notice finally that since $D = \epsilon E = E + 4\pi P$, we have
$\epsilon = 1 + (4\pi P/E)$, a constant if the polarization is proportional
to the field. To compute the dielectric constant, or refractive 
index, we have then to find the polarization, per unit field, and 
we proceed to do this for gases, and later for solids. 

172. Polarizability and Dielectric Constant of Gases. — In 
gases the molecules or atoms are relatively so far apart that we 
can neglect the interactions between them. Each molecule 
contains charges which can be displaced under the action of an 
external field, and these charges act as if they were held to posi- 
tions of equilibrium by restoring forces proportional to the 
displacement. Thus in a static case an electron e is acted on
by the forces eE of the external electric field, and $-cx$ the linear
restoring force (here c is the restoring-force constant, not the
velocity of light). The displacement is then $x = (e/c)E$, and
the induced dipole moment $ex = (e^2/c)E$. The ratio $e^2/c$, giving
the dipole moment set up by unit field, is called the polarizability,
denoted by $\alpha$. Thus the dipole moment per molecule is
$\alpha E$, and if there are N molecules per unit volume the polarization
P is $N\alpha E$, so that $\epsilon = 1 + 4\pi N\alpha$.

A very simple model of an atom will give us the order of
magnitude of the polarizability. The atom consists of a nucleus
of charge Ze, where Z is an integer, e the magnitude of the charge
on the electron, surrounded by a distribution of negative charge
equal to -Ze. In the external field the negative charge will
be displaced with respect to the nucleus. The restoring force
may be computed as if the negative charge filled a sphere of
radius R with uniform charge density. Then the positive
charge Ze, at distance r from the center, would be acted on by
a force as if the negative charge within a sphere of radius r
were concentrated at the center, all other negative charge being
neglected. This charge would be $r^3/R^3$ times the total charge,
so that the force would be $(Ze)^2 r/R^3$. The polarizability is then
found to be $R^3$, proportional to the volume of the molecule.
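
The order of magnitude implied by this model is easy to work out. The sketch below assumes an atomic radius of 10^-8 cm and a gas density of about 2.7 x 10^19 molecules per cubic centimeter (roughly that of a gas at standard conditions); both numbers, the electronic charge value, and the function names are illustrative assumptions, not values given in the text.

```python
import math

def polarizability_uniform_sphere(R):
    """Polarizability alpha = R^3 of the uniformly charged sphere model (R in cm)."""
    return R ** 3

def polarizability_from_force_constant(Z, e, R):
    """alpha = (Ze)^2 / c, with restoring-force constant c = (Ze)^2 / R^3."""
    c_force = (Z * e) ** 2 / R ** 3
    return (Z * e) ** 2 / c_force

def dielectric_constant(N, alpha):
    """epsilon = 1 + 4*pi*N*alpha for a gas of N molecules per unit volume."""
    return 1.0 + 4.0 * math.pi * N * alpha

R = 1.0e-8        # assumed atomic radius, cm
N = 2.7e19        # assumed molecules per cm^3
e = 4.803e-10     # electronic charge, e.s.u.
alpha = polarizability_uniform_sphere(R)
eps = dielectric_constant(N, alpha)
print(alpha, eps - 1.0, math.sqrt(eps) - 1.0)   # alpha, epsilon - 1, n - 1
```

With these assumed inputs the dielectric constant differs from unity by a few parts in ten thousand, which is the right order for common gases.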

173. Dispersion in Gases. — We now assume a sinusoidal 
external field of frequency v, as in a light wave. The magnetic 
force on the electron on account of its motion can be neglected. 
In addition to the external electric force, and the elastic restoring 
force, we introduce a damping force proportional to velocity, 
to account for absorption. The equation of motion for the 
electron is then, for the x coordinate, 


$$m\ddot{x} + mg\dot{x} + \omega_0^2 m x = eE_x^0 e^{i\omega t}, \qquad (8)$$

where we have placed $\omega = 2\pi\nu$. Thus we have the problem of
the damped linear oscillator in forced oscillation. We have
solved this problem in Chap. IV, and can write for the steady-state
solution

$$x = \frac{\dfrac{e}{m}E_x^0 e^{i\omega t}}{\omega_0^2 - \omega^2 + i\omega g} = \frac{\dfrac{e}{m}E_x}{\omega_0^2 - \omega^2 + i\omega g} \qquad (9)$$

in complex form. This shows that the electron vibrates at 
the same frequency as the light wave but with an amplitude 
depending on the frequency and out of phase with the light 
wave. If we have $N_k$ electrons per unit volume characterized
by the constants $\omega_k$ and $g_k$ (electrons of the kth kind) we get for
the dipole moment per unit volume:

$$P = \sum_k N_k e x_k = \sum_k \frac{N_k \dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k}\,E,$$

whence we get for the displacement vector

$$D = E + 4\pi P = E\left[1 + 4\pi\sum_k \frac{N_k \dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k}\right],$$

and if we introduce a "dynamic" refractive index (n - ik)
defined by $D = \epsilon E = (n - ik)^2 E$, we find

$$(n - ik)^2 = 1 + 4\pi\sum_k \frac{N_k \dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k}, \qquad (10)$$

so that the index of refraction is a function of the frequency of
the light, and different colors travel with different velocities. This
is known as the dispersion of light. Furthermore, in general, 
the index of refraction is a complex quantity and, as we have seen 
in our discussion of electromagnetic waves in metals, this indi- 
cates absorption and is not surprising in view of our introduction 
of a damping force. 

In the limit of low frequencies (long wave lengths of light,
where $\omega \ll \omega_k$) we may neglect the last two terms in the
denominator and find

$$n^2 = \epsilon = 1 + 4\pi\sum_k \frac{N_k e^2}{m\omega_k^2}$$

as the static value of the dielectric constant of insulators, agreeing
with the value found in the last section.

If the frequency of the light does not lie near any of the
natural frequencies of the electrons, we may neglect the frictional
force and find a real index of refraction given by

$$n = 1 + 2\pi\sum_k \frac{N_k \dfrac{e^2}{m}}{\omega_k^2 - \omega^2},$$

if we remember that the index of refraction for gases varies but
slightly from unity. Thus there is no absorption and we have
the case of normal dispersion. Let us consider the index of
refraction as a function of frequency of the light in the visible 
region of the spectrum. If the natural frequencies of the elec- 
trons lie in the ultra-violet (and also for the case that they lie 
in the infra-red) the index of refraction increases with increasing 
frequency, the normal behavior. 

In case the frequency of the light lies near a natural frequency,
we obtain the phenomenon of anomalous dispersion. In this
case the frictional term becomes important and we find an
absorption band in the neighborhood of $\omega_0$. The whole discussion
is similar to the case of a resonant electric circuit. For
simplicity let us assume only one resonant frequency. Remembering
that for gases n is very nearly unity, we have:

$$n - ik = 1 + 2\pi\frac{e^2}{m}\,\frac{N}{\omega_0^2 - \omega^2 + i\omega g},$$

and if we separate into real and imaginary parts, we obtain,

$$n = 1 + 2\pi\frac{Ne^2}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + \omega^2 g^2}, \qquad (11)$$

$$k = 2\pi\frac{Ne^2}{m}\,\frac{\omega g}{(\omega_0^2 - \omega^2)^2 + \omega^2 g^2}. \qquad (12)$$

n is known as the principal index of refraction and k the absorption
coefficient. If we plot n - 1 and k against the light frequency,
we get curves of the form shown in Fig. 47. Such
curves have already been considered in Prob. 10, Chap. IV.
In the neighborhood of the absorption region we see that the
index of refraction decreases with increasing frequency and this
is the anomalous behavior giving rise to the term anomalous
dispersion.

Fig. 47. Anomalous dispersion, showing index of refraction and absorption
coefficient as function of frequency.
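
The shape of the curves of Fig. 47 can be reproduced from Eqs. (11) and (12). The sketch below is a minimal illustration in dimensionless units: the resonant frequency is taken as 1, the damping constant g as 0.05, and the strength A = 2 pi N e^2 / m as 10^-3. All three values and the function name are assumptions chosen only to show the behavior.

```python
def index_and_absorption(omega, omega0, g, A):
    """Return (n, k) from Eqs. (11) and (12), with A standing for 2*pi*N*e^2/m."""
    d = omega0 ** 2 - omega ** 2
    denom = d * d + (omega * g) ** 2
    n = 1.0 + A * d / denom
    k = A * omega * g / denom
    return n, k

omega0, g, A = 1.0, 0.05, 1.0e-3        # assumed illustrative constants
for omega in (0.5, 0.8, 0.95, 0.99, 1.0, 1.01, 1.05, 1.2, 1.5):
    n, k = index_and_absorption(omega, omega0, g, A)
    print(omega, n - 1.0, k)            # frequency, n - 1, absorption coefficient
```

In the printed table n - 1 rises with frequency well below the resonance, falls through zero across the absorption band, and k is sharply peaked near the resonant frequency.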

174. Dispersion of Solids and Liquids. — In the case of solids 
and liquids we may no longer make the approximation that the 
force acting on an electron is simply the electric vector of the light 
wave in free space, but must take into account the added force 
on the electron due to the polarization of the body. We can 
calculate this force as follows: we imagine a small sphere of 
radius R (with its center at the position of the electron in ques- 
tion) cut out of the medium. If we do this, we have induced 
charges on the surface of this spherical volume from which we 
calculate the force at the center of the sphere. We have for 
the surface density of induced charge on a spherical ring at an
angle $\theta$, $\sigma = P_n = P\cos\theta$, as in Fig. 48. The area of the ring
is $2\pi R\sin\theta\cdot R\,d\theta = 2\pi R^2\sin\theta\,d\theta$, so that the charge on this
ring is

$$2\pi P R^2\cos\theta\sin\theta\,d\theta.$$

This charge produces a field at the center of the sphere whose
component parallel to E is

$$dE_1 = \frac{2\pi P R^2\cos^2\theta\sin\theta\,d\theta}{R^2},$$

so that the total charge on the sphere produces a field at the center
equal to

$$E_1 = 2\pi P\int_0^\pi \cos^2\theta\sin\theta\,d\theta = \frac{4\pi P}{3}.$$

The total electric field at the center of this sphere is then

$$E + \frac{4\pi P}{3}. \qquad (13)$$
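
The angular integral for the field of the induced surface charge can be confirmed numerically. This is a small sketch using a midpoint rule; the function name and the number of subdivisions are arbitrary choices made for illustration.

```python
import math

def cavity_integral(steps=10000):
    """Midpoint-rule value of the integral of cos^2(theta) sin(theta) from 0 to pi."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += math.cos(theta) ** 2 * math.sin(theta)
    return total * h

integral = cavity_integral()
field_per_unit_P = 2.0 * math.pi * integral     # E_1 / P
print(integral, field_per_unit_P, 4.0 * math.pi / 3.0)
```

The integral comes out as 2/3, so that the field of the surface charge is 4 pi P / 3, as used in Eq. (13).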

Of course, there is still the contribution to the force by the atoms
inside the little sphere we have cut out, but in an isotropic medium
this averages zero. We can now carry over our calculations for
gases if we replace E by $E + (4\pi P/3)$ in the expression for x.
Thus we get

$$P = \left(E + \frac{4\pi P}{3}\right)\sum_k \frac{N_k \dfrac{e^2}{m}}{\omega_{0k}^2 - \omega^2 + i\omega g_k},$$

and using the relations $D = \epsilon E = E + 4\pi P$, we have

$$E + \frac{4\pi P}{3} = \frac{\epsilon + 2}{3}\,E,$$

and we find for $\epsilon$

$$\frac{\epsilon - 1}{\epsilon + 2} = \frac{(n - ik)^2 - 1}{(n - ik)^2 + 2} = \frac{4\pi}{3}\sum_k \frac{N_k \dfrac{e^2}{m}}{\omega_{0k}^2 - \omega^2 + i\omega g_k}.$$

Fig. 48. Field in spherical cavity in dielectric.

If N represents the number of atoms, then

$$N_k = f_k N,$$

and $f_k$ gives the number of electrons of the kth kind per atom,
the so-called "oscillator strength," and we have

$$\frac{(n - ik)^2 - 1}{(n - ik)^2 + 2}\cdot\frac{1}{N} = \frac{4\pi}{3}\sum_k f_k\,\frac{e^2/m}{\omega_{0k}^2 - \omega^2 + i\omega g_k}. \qquad (14)$$

In all cases of transparent substances, where we can neglect the 
damping force, and the index of refraction is real, we have for a 
given frequency of light : 

$$\frac{n^2 - 1}{n^2 + 2}\cdot\frac{1}{\rho} = \text{constant},$$

where $\rho$ is the density of the body, obviously proportional to N.
This law, known as the Lorenz-Lorentz law, is surprisingly well
obeyed for many substances. Of course, in the limit of very long
electromagnetic waves, and for the electrostatic case,

$$\frac{\epsilon - 1}{\epsilon + 2}\cdot\frac{1}{\rho} = \text{constant},$$

giving us a relation between dielectric constant and density. 

If we use the expression $E + (4\pi P/3)$ instead of E in the equation
of motion of an electron, we find similarly to our equation
for gases:

$$(n - ik)^2 = 1 + 4\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k}, \qquad (15)$$

with the only difference that instead of the natural frequency
$\omega_{0k}$ of the electrons we find the apparent natural frequencies

$$\omega_k^2 = \omega_{0k}^2 - \frac{4\pi}{3}N_k\frac{e^2}{m}. \qquad (16)$$

Thus we have the same type of anomalous dispersion phenomena 
in solids and liquids that we have in gases. 
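
The shift of the natural frequency can be checked numerically for a single kind of electron, the case in which the two forms are exactly equivalent. The sketch below assumes dimensionless units and illustrative constants (A stands for N e^2 / m); the function names are hypothetical.

```python
import math

def eps_from_local_field(omega, omega0, g, A):
    """Solve (eps - 1)/(eps + 2) = (4*pi/3) * A / (omega0^2 - omega^2 + i*omega*g) for eps."""
    X = (4.0 * math.pi / 3.0) * A / complex(omega0 ** 2 - omega ** 2, omega * g)
    return (1.0 + 2.0 * X) / (1.0 - X)

def eps_shifted_resonance(omega, omega0, g, A):
    """Eq. (15) for one kind of electron, with the apparent frequency of Eq. (16)."""
    omega1_sq = omega0 ** 2 - 4.0 * math.pi * A / 3.0
    return 1.0 + 4.0 * math.pi * A / complex(omega1_sq - omega ** 2, omega * g)

omega0, g, A = 1.0, 0.05, 0.02          # assumed illustrative constants
for omega in (0.2, 0.6, 0.95, 1.0, 1.3):
    print(omega, eps_from_local_field(omega, omega0, g, A),
          eps_shifted_resonance(omega, omega0, g, A))
```

The two columns agree at every frequency, showing that the added force 4 pi P / 3 only displaces the resonance toward lower frequency without changing the form of the dispersion formula.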

175. Dispersion of Metals. — In metals we picture free electrons 
wandering about among fixed ions, and these electrons are the 
conduction electrons. On the average there is no resultant force 
on the electrons, so that under the influence of an external field 
we can place the force on an electron equal to eE. If we imagine 
the ions as rigid structures possessing no polarizability, we then 
have the simplest possible picture of a metal. Thus, in contrast 
to the bound electrons of the previous sections, we have no restor- 
ing force on these electrons. We must, however, introduce a 


damping force, so that steady-state motion becomes possible.
Thus we have as the equation of motion of conducting electrons:

$$m\ddot{x} + mg\dot{x} = eE. \qquad (17)$$

This equation must allow an atomistic calculation of the conductivity
and if the external field E is constant and in the x
direction, we get as the steady-state solution of this equation

$$x = \frac{eE}{mg}\,t + \text{constant}.$$

Thus the velocity is $\dot{x} = eE/mg$, and if N is the number of conducting
electrons per unit volume, we get for the current density

$$u = Ne\dot{x} = \frac{Ne^2E}{mg}.$$

Now by Ohm's law $u = \sigma E$, we find

$$\sigma = \frac{Ne^2}{mg}, \qquad (18)$$


so that we are led to an expression for conductivity from an 
atomic point of view. It is interesting to note that dimensionally 
$\sigma$ and g are both of the dimensions of frequencies. We have
already seen in Prob. 10, Chap. XXII, that the period associated
with $\sigma$ is the relaxation time, the time taken for any irregularity
in charge distribution within the metal to decrease to 1/eth
of its value, and have seen that this frequency, for good conductors,
is in the ultra-violet part of the spectrum. The meaning of
g is similar, as one could see by imagining an electron initially
with a given velocity, and finding the time taken for its velocity
to decrease to 1/eth of its initial value, the result being essentially
the period associated with g. It seems very reasonable to suppose
that approximately equal times would be required for the
velocity of electrons to be damped down, and for charge irregularities
to be ironed out, and, as a matter of fact, g is found to
be of the same order of magnitude as $\sigma$. One can estimate g
by making a guess as to the value of N, the number of free electrons
per unit volume, assuming, for instance, that there is one
free electron per atom, and then computing g from the equation
$g = Ne^2/m\sigma$. One has, then, two independent constants characterizing
the optical behavior of a metal, so that complicated
results are not surprising. In addition to this, metals like other
substances contain polarizable electrons, which make additional
contributions to the optical constants.


The formulas for the optical constants of a metal may be found 
simply by including the free or conduction electrons as a class of 
bound electrons whose binding force, and natural frequency, are 
zero. Thus 

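$$(n - ik)^2 = 1 + \frac{4\pi Ne^2/m}{-\omega^2 + i\omega g} + 4\pi \sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k},$$

$$n^2 - k^2 = 1 - \frac{4\pi\sigma}{g}\,\frac{1}{1 + \omega^2/g^2} + 4\pi \sum_k \frac{N_k e^2}{m}\,\frac{\omega_k^2 - \omega^2}{(\omega_k^2 - \omega^2)^2 + (\omega g_k)^2},$$

$$nk = \frac{\sigma}{\nu}\,\frac{1}{1 + \omega^2/g^2} + 2\pi \sum_k \frac{N_k e^2}{m}\,\frac{\omega g_k}{(\omega_k^2 - \omega^2)^2 + (\omega g_k)^2}, \tag{19}$$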

where in the last two we have written $Ne^2/m$ as $\sigma g$. The summations
are over the bound electrons. We notice that as the
frequency becomes low compared with $\sigma$, the first term in the
product nk becomes very large compared with unity, masking
the effect of the bound electrons. The difference $n^2 - k^2$ does
not become correspondingly large, so that in the limit, as we
stated in Chap. XXII, n becomes equal to k, and both approach
$\sqrt{\sigma/\nu}$, neglecting $\omega$ compared to g. It is easy to see that at low
frequency $n^2 - k^2$ approaches $\epsilon$, if in the dielectric constant we
include a contribution $-4\pi\sigma/g$ from the free electrons. However,
it is only at low frequencies that these simplifications enter. As
the frequency enters the near infra-red or visible region, it
becomes of the same magnitude as $\sigma$ and g, so that the contributions
of the free electrons become complicated, and at the same
time nk decreases so that the contributions of the bound electrons
become important. It is thus natural that experimentally the
curves of $n^2 - k^2$ and nk throughout the visible part of the
spectrum are very complicated, though they can be fitted fairly
accurately with formulas of the type we have derived, assuming
bound as well as free electrons. In the ultra-violet, the frequency
becomes too high for the free electrons to follow, the contributions
of the free electrons become small compared to those of the bound
electrons having resonance in that region, and a metal does not
behave essentially differently from an insulator.
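The low-frequency limit described above can be checked numerically. The following is a minimal sketch, assuming only the free-electron term of the formula for $(n - ik)^2$ is kept (no bound electrons) and writing $Ne^2/m$ as $\sigma g$; the function name and the numerical values are illustrative choices, not taken from the text.

```python
import cmath
import math


def free_electron_optical_constants(omega, sigma, g):
    """Return (n, k) from the free-electron term alone.

    (n - i k)^2 = 1 + 4*pi*sigma*g / (-omega^2 + i*omega*g),
    where sigma*g stands for N e^2 / m (Gaussian units).
    """
    eps = 1.0 + 4.0 * math.pi * sigma * g / complex(-omega**2, omega * g)
    root = cmath.sqrt(eps)          # principal root, real part >= 0
    return root.real, -root.imag    # root = n - i k


# Illustrative numbers only: sigma = g = 1 in arbitrary frequency units.
sigma, g = 1.0, 1.0
omega = 1.0e-4 * g                  # frequency far below g
nu = omega / (2.0 * math.pi)
n, k = free_electron_optical_constants(omega, sigma, g)
low_frequency_limit = math.sqrt(sigma / nu)
print(n, k, low_frequency_limit)
```

With these assumed values n and k come out nearly equal to each other and to $\sqrt{\sigma/\nu}$; raising omega toward g shows how the two separate.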

In conclusion, we should mention that the introduction of a 
frictional force proportional to the velocity of the electrons is at
best an extremely rough approximation. In metals the steady
state is made possible by collisions of the electrons with the ions 
of the lattice, and the energy of the electrons gained from the 
external field is thus transmitted to the lattice, excites lattice 
vibrations, and appears as heat, as we shall describe more in 
detail later. All in all, when one considers the approximate 
nature of this classical electron theory, it is gratifying that it 
checks as well as it does with experiment and assures us that a 
more refined atomic picture will lead to an exact theory. 


1. Show that in the case of normal dispersion for the visible spectrum 
where there is an absorption band in the ultra-violet, the index of refraction 
can be written as 

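$$n^2 = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} + \cdots,$$

where $\lambda$ is the wave length in vacuum and A, B, C are constants.

If there is also absorption in the infra-red show that the index of refraction
is then given by

$$n^2 = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} + \cdots - A'\lambda^2 - B'\lambda^4.$$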

2. Measurements of H₂ gas give the following values of the index of
refraction:

    λ in Å      (n − 1)      [tabulated values missing]

Using the expression in Prob. 1 for $n^2$ in reciprocal powers of $\lambda$, calculate
the best values of A, B, and C. If the measurements are made at room
temperature and atmospheric pressure, calculate the resonant frequency
$\omega_0$ and wave length from these constants.

3. Prove that in the case of anomalous dispersion for gases the maximum
and minimum values of n occur at the positions where the absorption coefficient
reaches half its maximum value. Show that the half width of the
absorption band equals the damping constant divided by the mass of an
electron. Assume $g/\omega_0 \ll 1$.

4. For the D line of sodium the following values of the constants in the
dispersion formula are found:

$\omega_0 = 3 \times 10^{15}$; $g = 2 \times 10^{10}$; $4\pi Ne^2/m = 10^{23}$.

Plot the index of refraction n and the absorption coefficient k as a function
of the frequency of light. Find the maximum and minimum values of the
index of refraction n. Find the maximum value of the absorption coefficient
k and the half width of the absorption band in Angstrom units.

5. Show that for gases the Lorenz-Lorentz law takes the approximate
form $\frac{2}{3}\,\frac{n - 1}{\rho} = \text{constant}$. The following measurements have been made
on air ($\rho$ given in arbitrary units):

    ρ          n
    1.00       1.0002929
    14.84      1.004338
    42.13      1.01241
    69.24      1.02044
    96.16      1.02842
    123.04     1.03633
    149.53     1.04421
    176.27     1.05213

Calculate $\frac{2}{3}\,\frac{n - 1}{\rho}$ and $\frac{n^2 - 1}{n^2 + 2}\,\frac{1}{\rho}$ for each of these measurements and compare
the constancy of the results (calculate to four significant figures).
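The arithmetic asked for in Prob. 5 can be organized as in the sketch below. It uses only the density and index values tabulated in the problem; the function and variable names are illustrative, and the spread (largest minus smallest value) is just one simple way of judging constancy.

```python
# Density (arbitrary units) and index of refraction of air, from Prob. 5.
measurements = [
    (1.00, 1.0002929),
    (14.84, 1.004338),
    (42.13, 1.01241),
    (69.24, 1.02044),
    (96.16, 1.02842),
    (123.04, 1.03633),
    (149.53, 1.04421),
    (176.27, 1.05213),
]


def approximate_form(rho, n):
    """(2/3)(n - 1)/rho, the form valid when n - 1 is small."""
    return 2.0 * (n - 1.0) / (3.0 * rho)


def lorenz_lorentz(rho, n):
    """(n^2 - 1)/(n^2 + 2) * (1/rho)."""
    return (n * n - 1.0) / (n * n + 2.0) / rho


approx = [approximate_form(rho, n) for rho, n in measurements]
exact = [lorenz_lorentz(rho, n) for rho, n in measurements]

for (rho, n), a, b in zip(measurements, approx, exact):
    print(f"{rho:7.2f}  {n:.7f}  {a:.4e}  {b:.4e}")

# One crude measure of constancy: largest minus smallest value.
spread_approx = max(approx) - min(approx)
spread_exact = max(exact) - min(exact)
print(spread_approx, spread_exact)
```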

6. The indices of refraction for the sodium D line, and densities in grams
per cubic centimeter of some liquids at 15°C. are

    Liquid                 n      Density
    [first entry missing]
    Carbon bisulphide      [values missing]
    Ethyl ether            [values missing]

Calculate the indices of refraction for the vapors at 0°C. and 760 mm. pressure.
The observed values for the vapors are 1.000250, 1.00148, and
1.00152, respectively.

7. The quantity $\frac{(n^2 - 1)\,m}{(n^2 + 2)\,\rho}$ is called the refractivity of a substance if m
denotes its mass. Prove that the refractivities of mixtures of substances
equal the sum of the refractivities of the constituents. (Neglect damping
forces from the start.)

8. Show that the molecular refractivity of a compound, defined as
$\frac{n^2 - 1}{n^2 + 2}\,\frac{M}{\rho}$, where M is the molecular weight, is equal to the sum of the atomic
refractivities of the atoms of which the compound is formed. (Neglect
damping forces.)

9. Prove that the apparent natural frequencies $\omega_k$, in the equation for
the index of refraction for a solid or a liquid, are related to the natural
frequencies $\omega_{0k}$ for the electrons in isolated atoms by the equation

$$\omega_k^2 = \omega_{0k}^2 - \frac{4\pi}{3}\,\frac{N_k e^2}{m}.$$


10. For the following gases we have the following values of $(n - 1)_\infty$
extrapolated to long wave lengths:

    Gas      (n − 1)∞ · 10⁶
    H₂       136.35
    N₂       294.5
    O₂       265.3

Calculate the values of $(n - 1)_\infty$ for the following gases: H₂O, NH₃, NO,
N₂O₄, O₃. The measured values of $(n - 1)_\infty \cdot 10^6$ are 245.6, 364.6, 288.2,
496.5, 483.6. Find the percentage discrepancy between the calculated and
observed values.


Suppose that we have an electrical charge oscillating back and 
forth sinusoidally with the time. This charge will send out a 
spherical electromagnetic wave, radiating in all directions. 
There are several physical problems connected with such a wave. 
First, the phenomenon may be on a large scale, as in a radio 
antenna. Radiation from a vertical antenna, as a matter of 
fact, can be approximately treated by replacing the antenna by 
such an oscillating charge. But also on a smaller scale we can 
treat the radiation of short electromagnetic waves, or in other 
words light, from an atom which contains oscillating electrons. 
The electrons may have been set in motion by heat or bombard- 
ment, in which case we have the treatment of the emission of 
light from a luminous body; or they may be in forced motion 
under the action of another light wave, as in the case of the scat- 
tering of light. As a first step in the discussion of these problems, 
we consider spherical solutions of the wave equation, then passing 
on to the special case of electromagnetic fields. 

176. Spherical Solutions of the Wave Equation. — The wave 
equation can be solved by separation in spherical coordinates, as 
we have seen in Probs. 6, 7, and 8 of Chap. XV, and in Sec. 130, 
Chap. XVIII. The solutions are of the form $e^{\pm i\omega t} \sin m\phi \; P_l^m(\cos\theta)\, R(r)$,
where R satisfies a differential equation which, by
a slight transformation of the results quoted above, can be written

$$\frac{d^2(rR)}{dr^2} + \left[\frac{\omega^2}{v^2} - \frac{l(l+1)}{r^2}\right] rR = 0. \tag{1}$$

The solution of the equation for R was shown in the problem
quoted above to be expressible in terms of Bessel's functions, of
half integral order, divided by $\sqrt{r}$. It proves to be possible,
however, to express these functions in an alternative manner in 
terms of exponential or trigonometric functions, and we shall use 
that more elementary method in the present chapter. Further, 
we shall find that we have to consider only the very simplest 
types of spherical waves, for the purposes we are interested in. 



The simplest solution in spherical coordinates is the one independent
of angle, for which l = 0. In this case, solving Eq. (1),
we have $rR = e^{\pm i\omega r/v}$, giving as the solution of the wave equation
the functions $\frac{e^{i\omega(t \pm r/v)}}{r}$, reducing to $\frac{1}{r}$ for the static case where
$\omega = 0$. This represents a sinusoidal wave, traveling out along r
(if we have $t - r/v$) or in along r (if we have $t + r/v$), with a
velocity v, and with an amplitude which decreases as 1/r. This
decrease of amplitude is necessary if equal amounts of energy
are to flow across all concentric spheres, since the intensity,
proportional to the square of the amplitude, must be proportional
to $1/r^2$ so that its product with the area of the sphere may be
constant.


A more general spherical wave can be obtained if we are not
limited to sinusoidal vibrations. Thus the wave equation in
spherical coordinates, neglecting terms in $\theta$ and $\phi$ which are zero
for solutions independent of angle, is

$$\frac{\partial^2(ru)}{\partial r^2} - \frac{1}{v^2}\,\frac{\partial^2(ru)}{\partial t^2} = 0, \tag{2}$$

which has a general solution $u = \frac{1}{r}\left[f\left(t - \frac{r}{v}\right) + g\left(t + \frac{r}{v}\right)\right]$, as
can be proved by direct substitution, where f, g are arbitrary
functions. This represents one wave traveling out from the
center, another traveling in, with arbitrary wave form, and
corresponds to the solution $f\left(t - \frac{lx + my + nz}{v}\right)$ for the wave
equation in rectangular coordinates, expressing a plane wave of
arbitrary wave form.
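The direct substitution mentioned above can be sketched briefly for the outgoing term; the ingoing term g goes through in the same way with the sign of r/v reversed. The abbreviation $w = ru$ is introduced here only for convenience and is not part of the original notation.

```latex
\begin{aligned}
w &\equiv ru = f\!\left(t - \tfrac{r}{v}\right), \\
\frac{\partial^2 w}{\partial r^2} &= \frac{1}{v^2}\, f''\!\left(t - \tfrac{r}{v}\right), \qquad
\frac{\partial^2 w}{\partial t^2} = f''\!\left(t - \tfrac{r}{v}\right), \\
\frac{\partial^2 w}{\partial r^2} - \frac{1}{v^2}\,\frac{\partial^2 w}{\partial t^2} &= 0 .
\end{aligned}
```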

More complicated waves are those which are not spherically
symmetrical, but instead depend on the angles. We have seen
in Sec. 140, Chap. XIX, that if 1/r is a solution of Laplace's
equation, then $\frac{\partial}{\partial n}\left(\frac{1}{r}\right)$ is a solution, where n is an arbitrary direction.
This solution represents the potential of unit dipole, the
differentiation giving the difference of the potentials of two opposite
charges infinitesimally close together. If $\theta$ is the angle
between n and the direction in which we are finding the potential,
we have $\frac{\partial}{\partial n}\left(\frac{1}{r}\right) = \frac{\partial}{\partial r}\left(\frac{1}{r}\right)\cos\theta = -\frac{1}{r^2}\cos\theta$. This is a solution
of Laplace's equation, and in terms of our standard solution in
spherical coordinates it is the solution corresponding to l = 1,
m = 0. The function of r is $r^{-(l+1)}$, in accordance with the
results of Sec. 130. As a matter of fact, it can be shown that all
the solutions of Laplace's equation, and therefore all the spherical
harmonics, can be derived in this way by differentiations of the
simple solution 1/r with respect to different directions.

In a similar way, if we are given the solution $\frac{e^{i\omega(t - r/v)}}{r}$ of the
wave equation, we may differentiate with respect to n and again
obtain a solution. This gives

$$\frac{\partial}{\partial r}\left(\frac{e^{i\omega(t - r/v)}}{r}\right)\cos\theta = -\left(\frac{1}{r^2} + \frac{i\omega}{rv}\right)e^{i\omega(t - r/v)}\cos\theta. \tag{3}$$

This is the solution corresponding to l = 1, as before, and the
function of r in Eq. (3) is the alternative way mentioned above of
writing the Bessel's function obtained by direct solution of Eq.
(1). For values of r small compared with a wave length, the
term $1/r^2$ is large compared with $\omega/rv$, remembering that $\omega/v = 2\pi/\lambda$.
Thus at short distances the second term in Eq. (3) can
be neglected, and the function falls off as $1/r^2$, as in the static case.
Further, at short distances, the quantity r/v in the exponent
represents a time lag which is only a short fraction of the period
of oscillation, so that we may neglect it, obtaining $-\frac{1}{r^2}\cos\theta\, e^{i\omega t}$,
the potential we should expect from a dipole of variable
moment $e^{i\omega t}$ from a quasi-stationary argument in which we
supposed that the variation of the moment was so slow that we
could treat the dipole instantaneously as if it were constant.
On the other hand, at large distances, the other term predominates,
and the solution of the wave equation falls off as 1/r. This
part of the field is called the radiation field, and we see from it
that this solution for a dipole persists to large r's just as does the
spherically symmetrical solution, the intensity falling off as
$1/r^2$, and the field representing a wave traveling out with velocity
v. This radiation field is a characteristic feature of solutions of
the wave equation, and is not present in the limiting case of
Laplace's equation.
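The competition between the two terms in the bracket of Eq. (3) can be made concrete with a few numbers. The sketch below simply evaluates $1/r^2$ and $\omega/rv = 2\pi/\lambda r$; the function name and the choice of unit wave length are illustrative assumptions. The two terms are equal at $r = \lambda/2\pi$, which marks roughly where the quasi-static region gives way to the radiation field.

```python
import math


def radial_terms(r, wavelength):
    """Magnitudes of the two terms in the bracket of Eq. (3).

    Returns (1/r^2, omega/(r v)), using omega/v = 2*pi/wavelength.
    """
    static_like = 1.0 / r**2
    radiation = (2.0 * math.pi / wavelength) / r
    return static_like, radiation


wavelength = 1.0    # arbitrary length unit (assumption)
for r in (0.01, wavelength / (2.0 * math.pi), 100.0):
    near, far = radial_terms(r, wavelength)
    print(f"r = {r:8.4f}   1/r^2 = {near:10.4e}   omega/(r v) = {far:10.4e}")
```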

177. Scalar Potential for Oscillating Dipole. — Let a charge e
oscillate up and down along the z axis, its displacement being
given by the real part of $Ce^{i\omega t}$. We shall assume an equal
and opposite charge to be always at the origin, so that the whole
thing is electrically neutral, and constitutes a dipole of moment
$eCe^{i\omega t} = Me^{i\omega t}$. We wish to find its field. We shall do this
by finding the scalar and vector potentials, first computing
these directly, then in a later section showing that they can be
easily obtained from another vector, called the Hertz vector.
The scalar and vector potentials are solutions of D'Alembert's
equations, in which the charge and current densities, respectively,
appear on the right sides of the equations. These are different
from zero only at the dipole, which is assumed to be of infinitesimal
dimensions, so that, except at the origin, the potentials satisfy
the wave equation. We must then look for solutions of the wave
equation satisfying the one condition that they reduce to the
correct value at the origin, or at the dipole itself. The solution
(3) is a function reducing to the scalar potential of a dipole in the
limiting case of a static field, and we have seen that it also reduces
to the value we should expect for points close to the dipole, even
in a variable field. It corresponds to the scalar potential of a
unit oscillating dipole. We expect, therefore, that for the dipole
of moment $Me^{i\omega t}$ the scalar potential will be

$$\phi = -M\,\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\cos\theta\, e^{i\omega t}, \tag{4}$$

where now we write the velocity equal to c, for the case of light 

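178. Vector Potential. — Next we may find the vector potential,
using two facts: first, $\operatorname{div} A + (1/c)\,\partial\phi/\partial t = 0$; second, since the
current is always along the axis, the vector potential must also
be in this direction. If now A is along the z axis, we easily
have $A_r = A\cos\theta$, $A_\theta = -A\sin\theta$, $A_\phi = 0$, if A is the magnitude
of the vector. Let us suppose tentatively that A is a function
only of r (being prepared to reject this if it does not work).
Then, using the equations for the divergence in spherical coordinates,
we have

$$\operatorname{div} A = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 A\cos\theta\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(-A\sin^2\theta\right) = \frac{dA}{dr}\cos\theta + \frac{2A}{r}\cos\theta - \frac{2A\cos\theta}{r} = \frac{dA}{dr}\cos\theta,$$

$$\frac{1}{c}\frac{\partial\phi}{\partial t} = \frac{i\omega\phi}{c}.$$

Hence we have

$$\left[\frac{dA}{dr} - \frac{i\omega}{c}\,M\,\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)e^{i\omega t}\right]\cos\theta = 0.$$

This can be satisfied by

$$A = \frac{i\omega}{c}\,M\,\frac{e^{i\omega(t - r/c)}}{r}.$$

We note that this, which represents $A_z$, satisfies the wave equation,
as, of course, it must. Then we have

$$A_r = \frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\cos\theta\, e^{i\omega t}, \qquad
A_\theta = -\frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}, \qquad
A_\phi = 0. \tag{5}$$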

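179. The Fields. — Let us first find the magnetic field H = curl
A. We at once have

$$H_r = H_\theta = 0,$$

$$H_\phi = \frac{1}{r}\left[\frac{\partial}{\partial r}\left(-\frac{i\omega}{c}\,Me^{-i\omega r/c}\sin\theta\, e^{i\omega t}\right) - \frac{\partial}{\partial\theta}\left(\frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\cos\theta\, e^{i\omega t}\right)\right]
= M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}\left[-1 - \frac{c}{i\omega r}\right]. \tag{6}$$

From this we see that the magnetic field always goes in circles
around the axis, as we should expect from the resemblance of the
problem to that of a linear current. At large distances, the
second term vanishes compared with the first, leaving

$$H_\phi = -M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}. \tag{7}$$

Next we find the electric field $E = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial A}{\partial t}$. We have

$$E_r = \frac{\partial}{\partial r}\left[M\,\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\right]\cos\theta\, e^{i\omega t} + M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\cos\theta\, e^{i\omega t}
= M\cos\theta\, e^{i\omega t}\,\frac{e^{-i\omega r/c}}{r}\left(\frac{2i\omega}{cr} + \frac{2}{r^2}\right),$$

$$E_\theta = -\frac{1}{r}\,M\,\frac{\partial}{\partial r}\left(\frac{e^{-i\omega r/c}}{r}\right)\sin\theta\, e^{i\omega t} - \frac{\omega^2}{c^2}\,M\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}
= -M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}\left(1 + \frac{c}{i\omega r} - \frac{c^2}{\omega^2 r^2}\right),$$

$$E_\phi = 0. \tag{8}$$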


From these results we see that at large r's (large compared
with a wave length), E and H become equal to each other, at
right angles, and at right angles to the direction of propagation,
just as with a plane wave. They equal $-M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}$,
the amplitudes, apart from the sinusoidal parts, being $M\,\frac{\omega^2}{c^2}\,\frac{\sin\theta}{r}$.
On the other hand, at small distances, the electrical field
approaches that calculated for the dipole by electrostatics,
falling off as $1/r^3$, while the magnetic field, 90 deg. out of phase
with the moment, or in phase with the current, is proportional
to $1/r^2$. At intermediate distances, the transition from the one
situation to the other is of a complicated form. For discussion
of radiation fields, it is only the result at large r that interests
us.
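The limiting behavior just described can be checked by evaluating the complex field components directly. The sketch below is an illustration under stated assumptions: it codes $E_r$, $E_\theta$, and $H_\phi$ in the closed forms of Eqs. (6) and (8) in Gaussian units, and the function name, the default values M = c = 1, and the sample distances are choices made here, not taken from the text.

```python
import cmath
import math


def dipole_fields(r, theta, omega, M=1.0, c=1.0, t=0.0):
    """Complex E_r, E_theta, H_phi of an oscillating dipole of moment M e^{i omega t}."""
    wave = cmath.exp(1j * omega * (t - r / c)) / r      # e^{i omega (t - r/c)} / r
    e_r = M * math.cos(theta) * wave * (2j * omega / (c * r) + 2.0 / r**2)
    common = -M * (omega / c) ** 2 * wave * math.sin(theta)
    x = c / (omega * r)                                 # small far away, large close in
    e_theta = common * (1.0 + x / 1j - x**2)
    h_phi = common * (1.0 + x / 1j)
    return e_r, e_theta, h_phi


omega = 2.0 * math.pi       # wave length equal to 1 when c = 1 (assumption)
theta = math.pi / 3.0

# Far field: E_theta and H_phi nearly equal, E_r negligible.
e_r, e_theta, h_phi = dipole_fields(1000.0, theta, omega)
print(abs(e_theta) / abs(h_phi), abs(e_r) / abs(e_theta))

# Near field: E_theta close to the electrostatic dipole value M sin(theta) / r^3.
r_near = 1.0e-3
_, e_theta_near, h_phi_near = dipole_fields(r_near, theta, omega)
print(abs(e_theta_near) * r_near**3 / math.sin(theta))
```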

The law giving the electric field at large r's can be put in an
interesting form. First we take the acceleration, $-\omega^2 Me^{i\omega t}$.
We imagine this to be a vector along the axis. Now if we wish
the field at a certain point, we take a plane normal to r passing
through this point, and project the acceleration vector on that
plane, using, however, not the instantaneous value but the value
at the previous time $(t - r/c)$. The result, dividing by $rc^2$, gives
$-M\,\frac{\omega^2}{c^2}\,\frac{e^{i\omega(t - r/c)}}{r}\sin\theta$, the correct value for the field. We see
from this that the dipole sends out maximum radiation to the
sides, none along the line of its motion. There is an interesting
extension of this to the case of a particle vibrating, not in a line,
but in an arbitrary ellipse (the most general sinusoidal motion).
To get the field, we again project the acceleration vector, which
is proportional to the displacement, on the normal plane. Thus
the vector E in general traces out an ellipse, and the wave is
elliptically polarized. An interesting case is that in which the
charge rotates in a circle. Then at a point along the axis, the
resulting light is circularly polarized; at a point in the plane
of the circle, it is linearly polarized; between, it is elliptically
polarized.

180. The Hertz Vector. — There is another interesting way of
considering the dipole solution, due to Hertz. The scalar and
vector potentials satisfy the relation

$$\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0.$$

It would be convenient to have only one quantity from which the
electromagnetic field can be derived. It is possible to find such
a quantity, a vector $\Pi$, called the Hertz vector. The above
relation can be satisfied identically if we place

$$A = \frac{1}{c}\frac{\partial\Pi}{\partial t}, \qquad \phi = -\operatorname{div}\Pi. \tag{9}$$

This vector $\Pi$ satisfies the wave equation with no subsidiary
conditions such as are imposed on the vector and scalar potentials.
If any solution $\Pi$ of the wave equation is found, then this represents
an electromagnetic field, and the electric and magnetic
fields are given by

$$E = \operatorname{grad}\operatorname{div}\Pi - \frac{1}{c^2}\frac{\partial^2\Pi}{\partial t^2}, \qquad
H = \frac{1}{c}\operatorname{curl}\frac{\partial\Pi}{\partial t}. \tag{10}$$

It turns out that the Hertz vector representing the field of an
oscillating dipole is simply a spherically symmetrical solution
of the wave equation. The correct solution, representing an
outgoing wave, is

$$\Pi = \frac{p(t - r/c)}{r}, \tag{11}$$


so that, if p represents the dipole moment of our oscillating charge
(including the time variation) pointing along the z axis, it is easy
to show that the vector and scalar potentials derived from this
Hertz vector are just those derived in the previous sections. For
example, the vector potential

$$A = \frac{1}{c}\frac{\partial\Pi}{\partial t} = \frac{1}{cr}\,\frac{dp(t - r/c)}{dt},$$

and if the dipole moment $p = Me^{i\omega(t - r/c)}$,

$$\frac{dp}{dt} = i\omega Me^{i\omega(t - r/c)},$$

giving for A the value we have already found.



In finding the vector potential, we must remember that $\Pi$ is a
vector pointing along the z axis and has the components:

$$\Pi_r = \Pi\cos\theta; \qquad \Pi_\theta = -\Pi\sin\theta; \qquad \Pi_\phi = 0.$$

If we take the divergence of this vector with a negative sign we
are led to our first result for the scalar potential. The fields
E and H must, of course, be the same as those discussed, since
the vector and scalar potentials are identical. For convenience
we write the expressions for them in vector notation. From the
above equations relating E and H to $\Pi$, we have

$$E = \operatorname{grad}\operatorname{div}\left[\frac{p(t - r/c)}{r}\right] - \frac{1}{c^2}\,\frac{p''(t - r/c)}{r}, \qquad
H = \frac{1}{c}\operatorname{curl}\left[\frac{p'(t - r/c)}{r}\right], \tag{12}$$

where the dashes denote differentiation with respect to t. These 
expressions lead to the same values we have been using when p 
varies sinusoidally with its argument. They are somewhat more 
general since they hold for any periodic motion of the dipole. 

181. Intensity of Radiation from a Dipole. — We can easily 
compute Poynting's vector, and find the total rate at which the 
dipole is radiating energy. Poynting's vector is evidently 

$$\frac{c}{4\pi}\,\frac{M^2\omega^4}{c^4}\,\frac{\cos^2\omega(t - r/c)}{r^2}\,\sin^2\theta,$$

the time average being

$$\frac{M^2\omega^4\sin^2\theta}{8\pi c^3 r^2}.$$

Let us now integrate over the surface of a sphere of radius r, to
get the total radiation. The element of area is $r^2\sin\theta\, d\theta\, d\phi$, so
that the result is

$$\frac{M^2\omega^4}{8\pi c^3}\int_0^{2\pi}\!\!\int_0^{\pi}\sin^3\theta\, d\theta\, d\phi = \frac{M^2\omega^4}{3c^3} = \frac{16\pi^4\nu^4M^2}{3c^3}, \tag{13}$$

if $\nu = \omega/2\pi$ is the frequency. This is a well-known formula for
the radiation from a dipole. The two essential features are that
the radiation is proportional to the square of the amplitude
of the dipole, and to the fourth power of the frequency.
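The angular integration leading to Eq. (13) is easy to confirm numerically. The following sketch integrates the time-averaged Poynting vector over a sphere by the midpoint rule and compares the result with $M^2\omega^4/3c^3$; the function name, step count, and the sample values of M, $\omega$, and c are illustrative assumptions.

```python
import math


def total_radiated_power(M, omega, c, steps=2000):
    """Integrate the time-averaged Poynting vector over a sphere.

    The integrand is M^2 omega^4 sin^2(theta) / (8 pi c^3 r^2) and the
    area element is r^2 sin(theta) dtheta dphi, so r drops out.
    """
    d_theta = math.pi / steps
    # Midpoint rule for the integral of sin^3(theta) from 0 to pi.
    polar = sum(math.sin((i + 0.5) * d_theta) ** 3 for i in range(steps)) * d_theta
    azimuth = 2.0 * math.pi          # the phi integral is trivial
    return M**2 * omega**4 / (8.0 * math.pi * c**3) * azimuth * polar


M, omega, c = 1.0, 2.0, 1.0          # illustrative values in arbitrary units
numeric = total_radiated_power(M, omega, c)
closed_form = M**2 * omega**4 / (3.0 * c**3)     # Eq. (13)
print(numeric, closed_form)
```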

182. Scattering of Light. — In addition to direct radiation, it
is important to consider the process of scattering of light. Suppose
that a wave, for example a plane wave, falls on a dipole of
the sort we have considered. Let the dipole have an equation of
motion

$$m(\ddot{x} + \omega_0^2 x) = eE,$$

if m is the mass, e the charge, of the vibrating particle, E the
external field, and x the displacement. Then, letting $E = E_0e^{i\omega t}$,
we have ex, the moment, equal to

$$\frac{e^2E_0}{m}\,\frac{e^{i\omega t}}{\omega_0^2 - \omega^2}.$$

This is the oscillating dipole moment produced by the field.
Then the dipoles set into motion by the wave will emit light,
which is scattered. The rate of emission by a single dipole is

$$\frac{\omega^4}{3c^3}\left(\frac{e^2E_0}{m}\,\frac{1}{\omega_0^2 - \omega^2}\right)^2.$$

Often the scattering is measured by the amount of light scattered
per cubic centimeter of material, divided by the intensity of the
incident light. The latter is $(c/4\pi)(E \times H)$, its mean value
being $cE_0^2/8\pi$. Further, the amount scattered per cubic centimeter
is N times that scattered by a single dipole, if there are N
dipoles and they scatter independently (as the molecules of a
gas do). Hence for the scattering we have

$$\frac{8\pi Ne^4}{3c^4m^2}\,\frac{1}{\left[(\omega_0/\omega)^2 - 1\right]^2}. \tag{14}$$


There are three important special cases of this scattering 
formula : 

(a) The Rayleigh Scattering Formula. — This is what we have in
the case where $\omega$ is small compared with $\omega_0$. Since for ordinary
atoms $\omega_0$ is a frequency in the ultra-violet, we have this condition
in the visible range of the spectrum. Then we may neglect 1
compared with $(\omega_0/\omega)^2$, obtaining for the scattering

$$\frac{8\pi Ne^4}{3c^4m^2}\,\frac{\omega^4}{\omega_0^4}. \tag{15}$$

The scattering is here proportional to $\omega^4$, or to $1/\lambda^4$, where $\lambda$
is the wave length. This proportionality to the inverse fourth
power of the wave length means that the short blue and violet 
waves will be scattered much more than the long red ones. An 


example is the scattering of light by the sky. The air molecules 
scatter, and on account of the law they scatter much more blue 
light, resulting in the blue color. The transmitted light thus 
has the blue removed and looks red, explaining the color near the 
sun at sunset. 

(b) The Thomson Scattering Formula. — In the other limiting
case of x-rays, when the frequency is large compared with $\omega_0$, the
scattering becomes

$$\frac{8\pi Ne^4}{3c^4m^2}. \tag{16}$$

This formula gives a scattering independent of the wave length,
and is very important in discussing x-ray scattering by substances.

(c) Resonant Scattering. — If $\omega$ is nearly equal to $\omega_0$, it is evident
that the denominator can become very small (of course, if we 
consider damping, it will not vanish), resulting in a very large 
scattering. This phenomenon can be much more conspicuous 
than the two other cases. Thus a bulb filled with sodium vapor, 
which has a natural frequency in the visible region, illuminated 
with light of this color, will scatter so much light that it appears 
luminous. This phenomenon is called resonance scattering. 
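The three special cases can be compared against the general scattering formula numerically. The sketch below is illustrative only: the function names are introduced here, the constants N, e, m, c are set to unity so that only the frequency dependence shows, and the wave lengths 450 and 700 (in arbitrary but equal units) are assumed stand-ins for blue and red light.

```python
import math


def prefactor(N=1.0, e=1.0, m=1.0, c=1.0):
    """8 pi N e^4 / (3 c^4 m^2), common to all three cases."""
    return 8.0 * math.pi * N * e**4 / (3.0 * c**4 * m**2)


def general_scattering(omega, omega0):
    """General undamped formula: prefactor / ((omega0/omega)^2 - 1)^2."""
    return prefactor() / ((omega0 / omega) ** 2 - 1.0) ** 2


def rayleigh_scattering(omega, omega0):
    """Low-frequency limit, proportional to omega^4."""
    return prefactor() * (omega / omega0) ** 4


def thomson_scattering():
    """High-frequency limit, independent of frequency."""
    return prefactor()


omega0 = 1.0
print(general_scattering(1.0e-3, omega0) / rayleigh_scattering(1.0e-3, omega0))
print(general_scattering(1.0e3, omega0) / thomson_scattering())
print(general_scattering(0.999, omega0) / thomson_scattering())   # near resonance

# Frequency is inversely proportional to wave length, so the blue-to-red ratio is:
blue_to_red = rayleigh_scattering(1.0 / 450.0, omega0) / rayleigh_scattering(1.0 / 700.0, omega0)
print(blue_to_red)
```

Near resonance the undamped formula grows without limit, which is why the text notes that damping must be considered there.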

183. Polarization of Scattered Light. — We observe that, if 
the incident light is plane polarized, the dipoles will all vibrate 
along the direction of its electric vector. Thus there will be no 
intensity in the scattered light along this direction. The scat- 
tered light will have a maximum intensity at right angles, and it 
will be plane polarized. It was by experiments based on these 
facts that the polarization of x-rays was first found. 

184. Coherence and Incoherence of Light. — In the previous 
paragraphs we calculated the scattering by N molecules which 
scatter independently by adding the intensities of the scattered 
radiation from each. The justification for this requires closer 
consideration. Since the Maxwell equations are linear the field 
vectors E and H satisfy the superposition principle, so that we 
should expect the total amplitude to be the sum of the amplitudes 
in the various waves, in which case the total intensity, being the 
square of the amplitude, would certainly not be the sum of 
the separate intensities. The key to this situation is found in the 
relations between the phases of the various waves which we are 
adding: if they are all in the same phase, they are said to be 
coherent, and the amplitudes add, while if they are in phases
having random relations to each other they are incoherent and
the intensities add. 

To be more precise, let us consider the sum of a number of
sinusoidal waves, all of the same frequency, but of different
amplitude and phase:

$$\sum_k A_k\cos(\omega t - \alpha_k) = \Bigl(\sum_k A_k\cos\alpha_k\Bigr)\cos\omega t + \Bigl(\sum_k A_k\sin\alpha_k\Bigr)\sin\omega t.$$

If all the phases should be the same, say $\alpha_k = 0$, then the
amplitudes of the cosine and sine terms will be $\sum_k A_k$ and 0,
respectively, so that the amplitudes add, and the intensity is
proportional to $\bigl(\sum_k A_k\bigr)^2$, or if, for instance, there are N terms of
equal amplitude, proportional to $N^2$ times the intensity of a
single wave. On the other hand, the $\alpha$'s may be completely
independent of each other, meaning that each $\alpha$ is equally likely
to have any value between 0 and $2\pi$, independent of the others.
Then we can see that $\sum_k A_k\cos\alpha_k$ will be far less than $\sum_k A_k$, since
we shall have just about as many terms with positive values of
$\cos\alpha_k$ as with negative, and the terms will just about cancel.
The cancellation will not be complete, however, as we see if we
compute the squares of the summations, which we must add to
get the intensity. The square of the first summation, for
instance, is

$$\Bigl(\sum_k A_k\cos\alpha_k\Bigr)^2 = \sum_k A_k^2\cos^2\alpha_k + \sum_{k \ne l} A_kA_l\cos\alpha_k\cos\alpha_l.$$

We must find the average of this, taking the $\alpha$'s as independent.
That is, we must perform the operation of integrating each $\alpha$
from 0 to $2\pi$ and dividing by $2\pi$. When we do this, the terms
$\cos^2\alpha_k$ average to ½, while the products of the cosines of two independent
$\alpha$'s average to zero, leaving $\tfrac{1}{2}\sum_k A_k^2$. The other summation
gives an equal term, so that we find that the mean square amplitude,
or mean intensity, averaged over phases, is the sum of the
individual intensities. This is the state of complete incoherence,
in which for N waves the intensity is N times the intensity of
a single wave, rather than $N^2$ as for the coherent case. The
cancellation of waves, then, while not complete, is more and more
perfect as N increases, for N becomes a smaller and smaller fraction
of $N^2$ as N increases.
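The contrast between the $N^2$ and N laws can be illustrated by a small numerical experiment. The sketch below is an illustration under stated assumptions: all waves are given unit amplitude, the intensity is measured in units of that of a single wave, and the phase average is approximated by averaging over a finite number of random trials with a fixed seed, so the incoherent result is only approximately N.

```python
import math
import random


def intensity(phases, amplitude=1.0):
    """Squared resultant amplitude of equal-amplitude waves with the given phases."""
    cosine_sum = sum(amplitude * math.cos(a) for a in phases)
    sine_sum = sum(amplitude * math.sin(a) for a in phases)
    return cosine_sum**2 + sine_sum**2


def mean_incoherent_intensity(n_waves, trials, rng):
    """Average the intensity over many independent draws of random phases."""
    total = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_waves)]
        total += intensity(phases)
    return total / trials


rng = random.Random(0)               # fixed seed for repeatability
n_waves = 50
coherent = intensity([0.0] * n_waves)                   # all phases equal
incoherent = mean_incoherent_intensity(n_waves, 4000, rng)
print(coherent, incoherent)
```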

We can now apply the idea of coherence to the scattering of
light from a gas. The phase of the wave at a point P, scattered
by an atom at a (Fig. 49), depends on the total path the light has
traveled from the source to a, and from a to P. Since the molecules
of a gas have no fixed positions with respect to each other,
these paths are in a random relation to each other, the phases are
incoherent, and we are justified in adding intensities. Such a
procedure would not be allowed for example in discussing the
scattering of x-rays by crystals, where the various atoms are in
fixed lattice positions. Indeed, here we do get interference, and
it is just by studying the interference patterns so obtained that
we obtain our information about the lattice structure of crystals.
Neither would the procedure be allowed in discussing the scattering
from a gas in the same direction as the incident radiation,
as in (b). For then the paths of the beams scattered from the
various atoms are approximately equal, the waves are in phase,
and they produce a resultant field at P proportional to the amplitude,
rather than the intensity, of the incident wave. This
scattered field can be shown to interfere with the incident wave
in such a way that the resultant produces the refracted wave.
The close relation of our scattering formulas to the formulas for
the index of refraction, therefore, becomes clear, and it is evident
that our two problems of refraction and scattering, though we
have treated them separately, are really parts of the same subject.
The scattering straight ahead produces refraction, and
does not depend on the exact placing of the molecules. Scattering
to the sides, on the other hand, does not occur unless the
molecules have a random arrangement, and then the intensity,
not the amplitude, is proportional to the number of molecules.

Fig. 49. — Scattering from atoms.

(a) At right angles to the incident beam, where the paths of the scattered light
from the atoms a, b, c are of different and random lengths, so that there is no
regular interference, and we add intensities.

(b) Scattering straight ahead, where the paths are approximately equal, and
the beams interfere to produce the refracted beam.

185. Coherence and the Spectrum. — The amplitude of a wave, 
as a function of time, is never exactly sinusoidal, but is really a 
much more complicated function. It is often desirable, how- 
ever, to resolve such a function into a spectrum; that is, write it 
as a sum of sinusoidal waves of different frequency. This can 
be conveniently done by Fourier series. To do this, we take a 
Fourier series with an extremely long period T, so long that all 
the phenomena we are interested in take place in a time short 
compared with T, so that we are not bothered by the periodicity 
of the series. Then, if our function is /(f), we have 

/(f) = ^ (An COS 0) n t + B n Sm Q} n t), 


o C T/2 2 C T/2 

A n = w\ f(t) cos ajdt, B n = ^ f(t ) sin <a„t dt , 

TJ-T/2 J-J-T/2 

Wn = Y 11 - ^ 

This gives an analysis into an infinite number of sine waves, with 
frequencies spaced very close together (on account of the very 
small size of 2*/T). No actual, physical wave is then perfectly 
sinusoidal, in the sense of having but one term in this expansion 
with an amplitude different from zero. We shall show in a prob- 
lem that even a perfectly sinusoidal wave which persists for only 
a finite length of time will have appreciable amplitudes for all 
those frequencies within a range $\Delta\omega$, equal in order of magnitude
to the reciprocal of the time during which the wave persists, so 
that a sine wave of long lifetime will correspond to a sharp line 
in the spectrum, while a rapidly interrupted wave will give a 
broad line. This is observed experimentally in the fact that 
increasing the pressure of a gas, thereby making collisions more 
frequent and interrupting the radiating of the atoms, broadens 
the spectral lines. 
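The reciprocal relation between the lifetime of a wave and the breadth of its spectral line can be checked numerically. The following is a minimal sketch, not the solution of the problem referred to above: it evaluates the Fourier integrals for a sine wave lasting a finite time and finds the frequency offset at which the intensity drops to half its peak. The carrier frequency, the durations, and the function names are all hypothetical choices made for illustration.

```python
import math

def spectral_intensity(omega, omega0, duration, steps=4000):
    """Unnormalized A^2 + B^2 at frequency omega for sin(omega0 t), 0 < t < duration."""
    dt = duration / steps
    a = b = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                 # midpoint rule
        f = math.sin(omega0 * t)
        a += f * math.cos(omega * t) * dt
        b += f * math.sin(omega * t) * dt
    return a * a + b * b

def half_width(omega0, duration):
    """Frequency offset from omega0 at which the intensity falls to half its peak."""
    peak = spectral_intensity(omega0, omega0, duration)
    lo, hi = 0.0, 2 * math.pi / duration   # the line's first zero lies near 2*pi/duration
    for _ in range(40):                    # bisection on the falling side of the line
        mid = 0.5 * (lo + hi)
        if spectral_intensity(omega0 + mid, omega0, duration) > 0.5 * peak:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

omega0 = 200.0                             # hypothetical carrier frequency
width_short = half_width(omega0, duration=1.0)
width_long = half_width(omega0, duration=4.0)
print(width_short, width_long, width_short / width_long)
```

Under these assumptions, a wave lasting four times as long should give a line roughly one quarter as broad, in agreement with the statement that the breadth is of the order of the reciprocal of the lifetime.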

The intensity is proportional to $f^2(t)$, or to the square of the
summation over frequencies. Just as before, this square consists
of terms like $A_n^2 \cos^2 \omega_n t$, and cross terms like
$A_n A_m \cos \omega_n t \cos \omega_m t$. Instantaneously none of these terms are necessarily zero.
But if we average over time, the terms of the first sort average to
$A_n^2/2$, while those of the second sort average to zero. The final
result, then, is that the time average intensity is the sum of the
intensities of the various frequencies:

$$\overline{f^2(t)} = \sum_n \tfrac{1}{2}(A_n^2 + B_n^2).$$


We are justified in considering the terms connected with a given 
n to be the intensity of light of that particular frequency in the 
spectrum, so that we have the theoretical method of determining 
the spectral analysis of any disturbance. And we see that the 
following statement is true: on a time average, sinusoidal waves 
of different frequencies are always incoherent, and never interfere. 
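The statement that the cross terms average to zero can be verified directly for a short series. The sketch below assumes a handful of made-up coefficients $A_n$, $B_n$ (they come from no physical source) and compares the time average of $f^2(t)$ over one period with $\sum_n \tfrac{1}{2}(A_n^2 + B_n^2)$.

```python
import math

def mean_square(amplitudes, period, samples=20000):
    """Time average of f(t)^2 over one period, f(t) = sum of A_n cos(w_n t) + B_n sin(w_n t)."""
    total = 0.0
    for k in range(samples):
        t = period * k / samples
        f = sum(a * math.cos(2 * math.pi * n * t / period)
                + b * math.sin(2 * math.pi * n * t / period)
                for n, (a, b) in amplitudes.items())
        total += f * f
    return total / samples

# Hypothetical coefficients {n: (A_n, B_n)}, chosen only for illustration.
amplitudes = {3: (1.0, 0.5), 7: (-0.4, 0.8), 12: (0.3, 0.0)}
averaged = mean_square(amplitudes, period=1.0)
sum_of_intensities = sum(0.5 * (a * a + b * b) for a, b in amplitudes.values())
print(averaged, sum_of_intensities)
```

The two numbers should agree to rounding error, since over a full period every product of two different frequencies averages to zero.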
186. Coherence of Different Sources. — It is known experi- 
mentally that light from two different sources never interferes; 
to get interference we must take light from a single source, split 
it into two beams, and allow these beams to recombine. If we 
regarded the sources as being monochromatic, it would be hard 
to see why this should be, for the amplitudes of two waves of the 
same frequency should add, rather than the intensities, and this 
is the essence of interference. But when we observe that each 
source really is represented by a Fourier series, the situation 
becomes plain. For two sources are always so different that their 
Fourier series will be entirely different. If we analyze both of 
them, the phase of the radiation of frequency $\omega_n$ from one will
be entirely independent of the phase of the corresponding fre- 
quency from the other. Thus if we add the disturbances, square, 
and average over this random relation between the phases of the 
two sources, the cross terms will cancel, and the intensities add. 
The randomness comes in this case, not in adding a great many 
terms of the same frequency, but in combining the terms of 
different frequencies, which are related in entirely independent 
ways in the two sources. 
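The cancellation of the cross terms under a random phase relation can be illustrated with a small numerical experiment. This sketch simplifies the argument to a single frequency component from each source, with the phase difference drawn at random; the amplitudes, the number of trials, and the random seed are hypothetical.

```python
import math
import random

def mean_intensity(a1, a2, phase_difference, samples=64):
    """Time average over one cycle of (a1 cos wt + a2 cos(wt + delta))^2."""
    total = 0.0
    for k in range(samples):
        wt = 2 * math.pi * k / samples
        total += (a1 * math.cos(wt) + a2 * math.cos(wt + phase_difference)) ** 2
    return total / samples

random.seed(0)                       # fixed seed so the sketch is repeatable
a1, a2 = 1.0, 0.7                    # hypothetical amplitudes of the two components
coherent = mean_intensity(a1, a2, 0.0)

trials = 4000
incoherent = sum(mean_intensity(a1, a2, random.uniform(0.0, 2 * math.pi))
                 for _ in range(trials)) / trials
sum_of_intensities = 0.5 * (a1 ** 2 + a2 ** 2)
print(coherent, incoherent, sum_of_intensities)
```

With a fixed phase relation the amplitudes add, giving $(a_1 + a_2)^2/2$; averaged over random phases the result should approach the sum of the separate intensities, $(a_1^2 + a_2^2)/2$.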


1. Discuss the weakening of sunlight on account of scattering, as the light 
passes through the atmosphere. Assume that the molecules of the atmos- 
phere have a natural frequency at 1,800 A. (where absorption is observed). 
Let each molecule contain an electron of this frequency. Assume that the 
number of molecules is such as to give the normal barometric pressure. 
Find the fractional weakening of a beam due to scattering in passing through 
a sheet of thickness ds, and from this set up the differential equation for 
intensity as a function of the distance. Solve for the ratio of intensity to 
the intensity before striking the atmosphere, for the sun shining straight 
down, and for it shining at an angle of incidence of 60 deg. Constants: 


$e = 4.774 \times 10^{-10}$ e.s.u., $m = 9.00 \times 10^{-28}$ gm., number of molecules in
1 gm.-mol $= 6.06 \times 10^{23}$.

2. A vibrating dipole radiates energy, and therefore its own energy
decreases. Noting that the rate of radiation is proportional to the energy,
set up the differential equation for the energy of the dipole as a function 
of the time. Find how long it takes the dipole to lose half its energy. Work 
out numerical values for the sort of dipole considered in Prob. 1. 

3. Using the results of Prob. 2, find the equivalent damping term which 
would make the dipole lose energy at the same rate as the radiation. This 
damping is called the radiation resistance. 

4. Show that the values for E and H, which we have found, satisfy Max- 
well's equations, by direct calculations in polar coordinates. 

5. Derive the expressions for E and H in terms of the Hertz vector $\Pi$ from
the equations defining $\Pi$.

6. Show that the fields E and H in terms of p(t - r/c) and its time deriva- 
tives reduce to the values in terms of the dipole moment M. 

7. Show that near an oscillating dipole the magnetic field is given by

$$H = -\frac{1}{c r^3}\,[r \times p'(t)],$$

and thus can be derived from the Biot-Savart law when we place

$$p'(t) = I(t)\,ds,$$

where I(t) is the current and ds an element of length in the direction of the
current.

8. Show from the Hertz vector for the dipole case, that at large distances
from the dipole,

$$H = -\frac{1}{c^2 r^2}\left[r \times p''\!\left(t - \frac{r}{c}\right)\right].$$

9. Suppose we have an alternating current of maximum value $I$ (measured
in e.m.u.) in a vertical antenna of length $l$. Treating this as a dipole,
show that the total radiation is

$$\frac{4\pi^2 c}{3}\,\frac{l^2 I^2}{\lambda^2}.$$

Show that the equivalent resistance necessary to produce the same power
loss (the radiation resistance) is

$$R = 80\pi^2\,\frac{l^2}{\lambda^2}$$

if R is measured in ohms, and if we place $c = 3 \times 10^{10}$ cm. per second.

10. Find the spectrum of a disturbance which is zero up to t = 0, is
sinusoidal until $t = T_0$, then is zero permanently. (Hint: make the period T
of the Fourier series indefinitely large compared with $T_0$.)

11. Find the spectrum of a disturbance which starts at t = 0, and is a 
sinusoidal damped wave after that. Show that the curve for intensity as a 


function of frequency has the same form as a resonance curve, in general, 
and that its breadth is connected with the logarithmic decrement in the 
same way. This illustrates an important principle : the emission and absorp- 
tion spectrum of the same substance are essentially equivalent. The 
resonance curve represents the absorption curve, on account of the relation 
of forced oscillators and dispersion, while the damped wave is the emission. 
(Hint: make the period T indefinitely large compared with the time taken 
for the oscillation to fall to 1/e th of its value.) 


Huygens' principle is a well-known elementary method for 
treating the propagation of waves, and in this chapter we shall 
consider its mathematical background, showing its close connec- 
tion with Green's theorem. The method is this: From each 
point of a given wave front, at t = 0, we assume that spherical 
wavelets start out. At time t, each wavelet will have a radius 
ct, and the envelope of these wavelets will form a new surface, 
which according to Huygens is simply the resulting wave front 
at this later time t. Thus, if the original wave front was a 
plane, it is easy to see that the final one will be a plane distant 
by the amount ct, while, if it is a sphere, the final wave front 
will be a concentric sphere whose radius is larger by ct. In 
either case this construction gives us the correct answer, agreeing 
with the more usual methods of computation. The one diffi- 
culty is that our construction would give a wave traveling back- 
ward, as well as one traveling forward; the solution of this 
difficulty appears when we use the methods of this chapter. 

We may look at our process in a slightly different way, not 
used by Huygens, but developed later when the interference 
of light was being worked out. Suppose that, instead of taking 
the envelope of all the spherical wavelets, we consider that each 
of these wavelets has a certain amplitude, consisting of a sinu- 
soidal vibration. We then add these vibrations, just as if 
the wavelets were being sent out by interfering sources of light, 
and the resulting amplitude is taken to be that in the actual 
wave. This process can be shown to lead to essentially the same 
result, and it is this which can be justified theoretically. As 
a further generalization, it is not necessary to take the original 
surface to be a wave front; it can be any surface, so long as we 
allow the scattered wavelets to have the suitable phase and
amplitude.
Our final result, then, is this: The disturbance at a point P 
of a wave field may be obtained by taking an arbitrary surface, 



and performing an integration over this surface. The contribu- 
tion of a small element of area dS of this surface equals the 
amplitude at P of a spherical wave starting from dS at such a 
time that it reaches P at time t. That is, if the distance from
dS to P is denoted by r, this wave is of the form $\dfrac{f(t - r/c)}{r}$.
Now the contribution, for a given wavelet, must surely be proportional
to the disturbance at dS, which we may call $f$ (a function
of time and position), and to dS. Hence we have something
like $\iint \dfrac{f(t - r/c)}{r}\,dS$ for the final result. We are thus led to
a formula of this sort:

$$f \ (\text{at a point } P) = \text{constant} \times \iint \frac{f(t - r/c)}{r}\,dS,$$

where the surface integral is over a surface surrounding P. 
This suggests the solution of Laplace's equation by Green's 
method, where we had the value of a function $\phi$ at an interior
point of a region where $\nabla^2\phi$ was zero as a surface integral over
the boundary. As a matter of fact, an analogue to Green's 
theorem is the correct statement of Huygens' principle, and 
replaces the formula which we have derived intuitively above, 
and which is not just correct. 

187. The Retarded Potentials— In Chap. XXI, we have 
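introduced scalar and vector potentials $\phi$ and $A$, giving the
electric and magnetic fields by the relations

$$E = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial A}{\partial t}, \qquad H = \operatorname{curl} A.$$

For these potentials we found equations of the form

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho,$$

with an equation of the same type for each component of $A$,
or D'Alembert's equation. We ask first how to get a solution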
of D'Alembert's equation analogous to the simple solution

$$\phi = \iiint \frac{\rho}{r}\,dv$$

of Poisson's equation. We shall not carry through the proof of
the solution, for that is rather complicated. But the essence
of Poisson's equation is that we divide up all space into volume
elements dv, and that $\rho\,dv/r$ is the potential of the point charge
$\rho\,dv$ at a distance r. This potential, of course, is a solution of
Laplace's equation, as is 1/r, at all points except for r = 0, 
where the charge is located. 

In a similar way, to solve D'Alembert's equation, we divide 
up our charge into small elements, and write the potential 
as the sum of the separate potentials of these small charges. 
The separate potentials must now be, except at r = 0, solutions 
of the wave equation. This means that, since any change of 
the charge will be propagated outward with the velocity c, 
the potential at a given point of space resulting from a particular 
charge cannot be derived from the instantaneous value of the 
charge, but must be determined, instead, by what the charge 
was doing at a previous instant, earlier by the time r/c required 
for the light to travel out from the charge to the point we are 
interested in. In other words, if $\rho(x, y, z, t)$ is the charge density
at x, y, z at the time t, and r is the distance from x, y, z to x',
y', z', where we are finding the field, we shall expect the potential
of the charge in dv to be

$$\frac{\rho(x, y, z, t - r/c)\,dv}{r},$$

and for the whole potential we shall have

$$\phi(x', y', z', t) = \iiint \frac{\rho(x, y, z, t - r/c)}{r}\,dv. \qquad (3)$$

This solution is, as a matter of fact, correct. We have already
seen that $f(t - r/c)/r$ is a solution of the wave equation, where
$f$ is any function, so that the integrand actually satisfies the
wave equation, as in the earlier case $1/r$ satisfied Laplace's
equation. The potential $\phi$ determined by this equation is
called a retarded potential, since any change in the charge is not
instantaneously observable in the potential at a distant point,
but its effect is retarded on account of the finite velocity of
light. The solution for the vector potential is determined in
an analogous manner.
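That $f(t - r/c)/r$ satisfies the wave equation away from $r = 0$ can be checked by finite differences. The sketch below is only an illustration: it assumes units in which $c = 1$, takes a Gaussian pulse as the arbitrary function $f$, and uses the form $(1/r)\,\partial^2(r\psi)/\partial r^2$ of the Laplacian for a spherically symmetric function. The step sizes and the point of evaluation are hypothetical.

```python
import math

C = 1.0  # velocity of light; units with c = 1 are an assumption made for convenience

def f(u):
    """An arbitrary smooth function of the retarded time (a Gaussian pulse, for illustration)."""
    return math.exp(-u * u)

def psi(r, t):
    """Retarded spherical wave f(t - r/c) / r."""
    return f(t - r / C) / r

def check_wave_equation(r, t, dr=0.01, dt=0.005):
    """Return (Laplacian of psi, second time derivative of psi / c^2) by central differences."""
    # For a spherically symmetric function the Laplacian is (1/r) d^2(r psi)/dr^2.
    laplacian = ((r + dr) * psi(r + dr, t) - 2 * r * psi(r, t)
                 + (r - dr) * psi(r - dr, t)) / (r * dr * dr)
    d2_dt2 = (psi(r, t + dt) - 2 * psi(r, t) + psi(r, t - dt)) / (dt * dt)
    return laplacian, d2_dt2 / C ** 2

laplacian, time_term = check_wave_equation(r=10.0, t=10.5)
print(laplacian, time_term)
```

The two printed values should agree to within the truncation error of the difference formulas, which is what the wave equation requires.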

188. Mathematical Formulation of Huygens' Principle. — In 
discussing the application of Green's theorem to the solution of 
Poisson's equation in a finite region of space, we have proved

$$\phi = \iiint \frac{\rho}{r}\,dv - \frac{1}{4\pi}\iint \left[\phi\,\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial\phi}{\partial n}\right] dS,$$

the result of the last paragraph being the special case where the
region of integration is infinite and the surface integral drops
out. We now wish to find an analogous theorem for use with
D'Alembert's equation. Here again we shall not give a real
derivation, for this is very complicated, but shall merely describe
the formula which results, and show that it is plausible. We have
already discussed the volume integral. In the surface integral,
the first term gave the potential of a double layer of strength
$\phi/4\pi$, the second the potential of a surface charge of magnitude
$\dfrac{1}{4\pi}\dfrac{\partial\phi}{\partial n}$. Each of the terms, $\phi\,\dfrac{\partial(1/r)}{\partial n}$ and $\dfrac{1}{r}\dfrac{\partial\phi}{\partial n}$, is a solution
of Laplace's equation since it represents the potential of certain
charges.
In our case of the wave equation, the formula has two corresponding
terms: one giving the potential of a double layer,
the other of a surface charge. But now the charges change with
time, so that we must use solutions of the wave equation in
the integral. We have already seen that the solution of the wave
equation corresponding to $1/r$ is $f(t - r/c)/r$; hence we expect the
second term to be replaced by $\dfrac{1}{r}\left(\dfrac{\partial\phi}{\partial n}\right)_{t - r/c}$, where this means
that the partial derivative, which is now a function of time as
well as of position on the surface, is to be computed, not at t,
but at $t - r/c$. Similarly corresponding to $\dfrac{\partial(1/r)}{\partial n}$, the difference
of the potentials of two equal and opposite point charges at
neighboring points of space, we have $\dfrac{\partial}{\partial n}\left[\dfrac{f(t - r/c)}{r}\right]$. Remembering
that in differentiating with respect to n we must regard r
as a variable each time it occurs, this is

$$\frac{\partial}{\partial n}\left[\frac{f(t - r/c)}{r}\right] = f(t - r/c)\,\frac{\partial(1/r)}{\partial n} + \frac{1}{r}\,\frac{\partial f(t - r/c)}{\partial n} = -\frac{\cos(n, r)}{r}\left[\frac{f(t - r/c)}{r} + \frac{1}{c}\,\frac{\partial f(t - r/c)}{\partial t}\right],$$

where in the last term we have used the relation

$$\frac{\partial f(t - r/c)}{\partial n} = \frac{\partial f(t - r/c)}{\partial(t - r/c)}\,\frac{\partial(t - r/c)}{\partial n} = \frac{\partial f(t - r/c)}{\partial t}\left(-\frac{1}{c}\,\frac{\partial r}{\partial n}\right) = -\frac{1}{c}\cos(n, r)\,\frac{\partial f(t - r/c)}{\partial t}.$$

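We should, therefore, expect to have

$$\phi = \iiint \frac{\rho(x, y, z, t - r/c)}{r}\,dv + \frac{1}{4\pi}\iint \left\{\frac{\cos(n, r)}{r}\left[\frac{\phi(t - r/c)}{r} + \frac{1}{c}\,\frac{\partial\phi(t - r/c)}{\partial t}\right] + \frac{1}{r}\,\frac{\partial\phi(t - r/c)}{\partial n}\right\} dS. \qquad (4)$$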

This, as a matter of fact, is the correct formula. The first term 
represents the potential due to all the charge within the volume ; 
if there are no sources of light within this volume, the volume 
integral is then zero, and that is the usual case with optical 
applications. The surface integral represents the remaining 
potential as arising from a distribution of charge and double 
distribution about the surface, each surface element sending 
out a wavelet which on closer examination proves to be the 
Huygens' wavelet we are interested in. Thus, starting from 
Green's theorem and D'Alembert's equation, we have arrived 
at a mathematical formulation of Huygens' theorem. 

To give a suggestion of the rigorous proof of this formula, 
we could proceed as follows: First, we notice that $\phi$ defined by
this integral satisfies the wave equation; for since each term 
of the integrand separately is a solution, the sum must also 
be. Now it follows from this, although we have not proved it, 
that if the solution reduces to the correct boundary values at 
all points of the boundary, the solution must be the correct one, 
the reason being essentially that the boundary values determine 
a solution uniquely, so that, if we have one solution of the 
equation with the right boundary values, it must be the only 


correct solution. We must then show that the $\phi$ defined by the
integral actually has the correct boundary values. This could 
be done by a more careful treatment, and we should then have a 
demonstration of the formula. The more conventional proof, 
however, is a fairly direct though complicated application of 
Green's theorem. 

189. Application to Optics. — We shall now take our general 
formula (4), and apply it to the cases we meet in optics, showing 
that it reduces to something like the formula which we had earlier 
derived intuitively. We suppose that light is emitted by a point 
source, and that the value of some quantity connected with, and 
satisfying, the wave equation (one of the components of the fields 
or potentials — they all satisfy the same relations) has the form 

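$\dfrac{A}{r_1}\,e^{2\pi i\nu(t - r_1/c)}$, where $r_1$ is the distance from the source to the point
where we wish to find the disturbance. Then we wish to get
the disturbance at P, not by direct calculation, but by using
Huygens' principle. Suppose we take a closed surface. This
surface can either surround the source, or the point P where we
wish the disturbance. In any case, we have n as the normal
pointing out of the part of space in which P is located. At a
point of the surface, $\phi = \dfrac{A}{r_1}\,e^{2\pi i\nu(t - r_1/c)}$, where $r_1$ is the distance
from the source to the point on the surface. We then have,
if r is the distance from P to a point on the surface,

$$\phi(t - r/c) = \frac{A}{r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]},$$

$$\frac{\partial\phi(t - r/c)}{\partial t} = \frac{2\pi i\nu A}{r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]},$$

$$\frac{\partial\phi(t - r/c)}{\partial n} = -\cos(n, r_1)\left(\frac{1}{r_1} + \frac{2\pi i\nu}{c}\right)\frac{A}{r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}.$$

Thus finally

$$\phi = \frac{1}{4\pi}\iint \frac{A}{r r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}\left[\left(\frac{1}{r} + \frac{2\pi i}{\lambda}\right)\cos(n, r) - \left(\frac{1}{r_1} + \frac{2\pi i}{\lambda}\right)\cos(n, r_1)\right] dS. \qquad (5)$$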

In this formula, as in Chap. XXV, we have two sorts of terms,
some significant at small values of $r$ and $r_1$, others at large.
We easily see that, if $r$ and $r_1$ are large compared with a wave
length, as is always the case in optics, the only terms we need
retain are those in $1/\lambda$. Hence to this approximation

$$\phi = \iint \frac{iA}{2\lambda r r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}\,[\cos(n, r) - \cos(n, r_1)]\,dS. \qquad (6)$$

This final form suggests our earlier, intuitive formulation of
Huygens' principle. The incident amplitude at dS is $\dfrac{A}{r_1}\,e^{2\pi i\nu(t - r_1/c)}$.
Now we set up, starting from dS, a wavelet whose amplitude
is this value, retarded by the amount r/c, divided by r, and
multiplied by the factor $\dfrac{i}{2\lambda}\,[\cos(n, r) - \cos(n, r_1)]\,dS$. This
is just what we should expect, except for the last factor. The
term $i$ introduces a change of phase of 90 deg., not present in
Huygens' form of the principle, but necessary. The term
$\cos(n, r) - \cos(n, r_1)$ makes the wavelets have an amplitude
which depends on angle. When $r$ and $r_1$ are in opposite directions,
which is the case when the surface is between the source
and P, the factor approaches 2, while when $r$ and $r_1$ are parallel,
and the surface is beyond P, it becomes zero. This means that
the wavelets do not travel backwards, thus removing the diffi- 
culty noticed earlier in Huygens' method. The wavelets have 
an amplitude depending on their wave length, decreasing for 
the longer wave lengths. 
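The behavior of the angular factor in the two limiting cases can be made concrete with a small vector calculation. In this sketch the source, the point P, and the surface elements are placed on a single axis at hypothetical positions; the vector r is drawn from P to the element and $r_1$ from the source to the element, and the normal is taken to point out of the part of space containing P, as in the text.

```python
import math

def cos_angle(u, v):
    """Cosine of the angle between two vectors given as 3-tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def wavelet_factor(source, point, element, normal):
    """cos(n, r) - cos(n, r1), with r drawn from P and r1 from the source to the element dS."""
    r = tuple(e - p for e, p in zip(element, point))
    r1 = tuple(e - s for e, s in zip(element, source))
    return cos_angle(normal, r) - cos_angle(normal, r1)

source, point = (0.0, 0.0, 0.0), (0.0, 0.0, 3.0)
# Element between source and P; the normal points out of the region containing P.
between = wavelet_factor(source, point, (0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
# Element beyond P, again with the normal pointing away from P's side.
beyond = wavelet_factor(source, point, (0.0, 0.0, 5.0), (0.0, 0.0, 1.0))
print(between, beyond)
```

For the element lying between the source and P the factor is 2; for the element beyond P it vanishes, which is the statement that the wavelets do not travel backwards.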

190. Integration for a Spherical Surface by Fresnel's Zones. — 
Let us now carry out our integration, and verify Huygens' 
method, in a simple case. We take the surface to be a sphere, 
surrounding the source, and therefore a wave front. We note 
that n is the inner normal of the sphere. Thus $r_1$ is constant
all over the sphere, and $\cos(n, r_1) = -1$ at all points, so that
the formula simplifies to

$$\phi = \frac{iA}{2\lambda r_1}\,e^{2\pi i\nu(t - r_1/c)} \iint \frac{e^{-2\pi i r/\lambda}}{r}\,[\cos(n, r) + 1]\,dS.$$

Now suppose we introduce, as a coordinate on the sphere, the
distance r from the point P; that is, we cut the sphere with
spheres concentric with P, laying off zones between them, as in
Fig. 50. We can easily get the area between r and r + dr, and
hence the element of area. Take as an axis the line joining
the source and the point P, and consider a zone making an angle
between $\theta$ and $\theta + d\theta$ with the axis. The area of the zone is
$2\pi r_1^2 \sin\theta\,d\theta$. But now by the law of cosines, if R is the distance
from the source to P, $r^2 = R^2 + r_1^2 - 2Rr_1\cos\theta$, and differentiating,
$2r\,dr = 2Rr_1\sin\theta\,d\theta$. Hence for the area of the zone
we have $\dfrac{2\pi r_1 r}{R}\,dr$. Introducing this, we have

$$\phi = \frac{\pi i A}{\lambda R}\,e^{2\pi i\nu(t - r_1/c)} \int_{r_{min}}^{r_{max}} e^{-2\pi i r/\lambda}\,[\cos(n, r) + 1]\,dr,$$

where $r_{min} = R - r_1$, $r_{max} = R + r_1$.

To carry out this integration, we use a device called Fresnel's 
zones, giving us an approximate value in a very elementary way. 


Fig. 50.— Construction for Fresnel's zones on a sphere surrounding the source. 

Beginning with $r_{min}$, we take a set of zones such that the outer
edge of each corresponds to a value of r just half a wave length
greater than the inner edge. The contributions of successive
zones will almost exactly cancel. The integral, then, consists
of a sum of terms, say $s_1 - s_2 + \cdots + s_n$, where the magnitudes of
$s_1, s_2, \ldots$ vary only very slightly from one to the next. Now
it is true in general that in such a series the sum is approximately
half the sum of the first and last terms. We can see this as
follows. We group the terms

$$\frac{s_1}{2} + \left(\frac{s_1}{2} - s_2 + \frac{s_3}{2}\right) + \cdots + \left(\frac{s_{n-2}}{2} - s_{n-1} + \frac{s_n}{2}\right) + \frac{s_n}{2}.$$

Now, on account of the slow variation of magnitude, we have very nearly
$s_k = \dfrac{s_{k-1} + s_{k+1}}{2}$. If this
were so, however, each of the parentheses would vanish, leaving
only $\dfrac{s_1 + s_n}{2}$. In our case, the contribution of the first zone is to be
considered, but that of the last zone is practically zero, on account
of the factor $\cos(n, r) + 1$, so that the result is half the first zone.
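The rule that such an alternating sum is nearly half the sum of its first and last terms is easily tried on a simple series. The sketch below uses the terms $1/k$ from $k = 100$ to $k = 200$, a hypothetical choice that starts well out in the series so that successive terms differ only slightly.

```python
def alternating_sum(terms):
    """s_1 - s_2 + s_3 - ... for a list of positive magnitudes."""
    return sum(s if k % 2 == 0 else -s for k, s in enumerate(terms))

# 101 slowly varying terms, so the series begins and ends with a plus sign.
terms = [1.0 / k for k in range(100, 201)]
exact = alternating_sum(terms)
estimate = 0.5 * (terms[0] + terms[-1])
print(exact, estimate)
```

The estimate and the exact sum should differ by well under one per cent here; the agreement becomes poorer if the series is started where the terms change rapidly from one to the next.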


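Now, in the first zone, $\cos(n, r) + 1$ is so nearly equal to 2
that we can take it outside the integral, obtaining

$$\phi = \frac{\pi i A}{\lambda R}\,e^{2\pi i\nu(t - r_1/c)} \int_{R - r_1}^{R - r_1 + \lambda/2} e^{-2\pi i r/\lambda}\,dr = \frac{A}{R}\,e^{2\pi i\nu(t - r_1/c)}\,e^{-2\pi i(R - r_1)/\lambda} = \frac{A}{R}\,e^{2\pi i\nu(t - R/c)},$$

the correct value. (7)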
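As a check on the zone argument, the integral over the whole sphere can also be evaluated numerically. The sketch below sets $A = 1$, omits the common factor $e^{2\pi i\nu(t - r_1/c)}$, and integrates by Simpson's rule. The expression used for $\cos(n, r)$ is not given in the text; it is an assumption obtained from the law of cosines with n the inner normal, and the values of R, $r_1$, and $\lambda$ are hypothetical.

```python
import cmath
import math

def zone_integral(R, r1, wavelength, steps_per_wavelength=200):
    """(pi i / (lambda R)) * integral of e^{-2 pi i r/lambda} [cos(n, r) + 1] dr over R-r1..R+r1."""
    r_min, r_max = R - r1, R + r1
    n = int((r_max - r_min) / wavelength * steps_per_wavelength)
    n += n % 2                       # Simpson's rule needs an even number of intervals
    h = (r_max - r_min) / n

    def integrand(r):
        # Assumed geometry: cos(n, r) = (R^2 - r1^2 - r^2) / (2 r1 r),
        # which is +1 at r_min and -1 at r_max.
        cos_nr = (R * R - r1 * r1 - r * r) / (2 * r1 * r)
        return cmath.exp(-2j * math.pi * r / wavelength) * (cos_nr + 1)

    total = integrand(r_min) + integrand(r_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(r_min + k * h)
    return (math.pi * 1j / (wavelength * R)) * total * h / 3

R, r1, wavelength = 50.0, 20.0, 1.0  # hypothetical distances, in units of the wave length
numeric = zone_integral(R, r1, wavelength)
expected = cmath.exp(-2j * math.pi * (R - r1) / wavelength) / R
print(numeric, expected)
```

Agreement is to be expected only to terms of the order of the wave length divided by the distances involved, since both formula (6) and the zone construction neglect such terms.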


191. The Use of Huygens' Principle. — In the derivations of 
this chapter we have traveled in a very roundabout way to reach 
a very obvious result. We naturally ask, what is Huygens' 
principle good for, aside from a mathematical exercise? The 
answer is found in the problem of diffraction. There one has 
certain opaque screens, with holes in them, and a light wave fall- 
ing on them. If the light comes from a point source, geometrical 
optics would tell us that the shadow of the screen would have 
perfectly sharp edges. But actually this is not true; there are 
light and dark fringes around the edge of the shadow. If the 
shadow is observed at a greater and greater distance, these fringes 
get proportionally larger and larger, until they entirely fill the 
image of the hole. Finally at great distances the fringes grow in 
size until the resulting pattern has no resemblance at all to the 
geometrical image. There are then two general sorts of diffrac- 
tion: first, that in which the pattern is like the geometrical image, 
but with diffuse edges, and which is called Fresnel diffraction; 
secondly, that in which the pattern is so extended that it has no 
resemblance to the geometrical image, and which is called Fraun- 
hofer diffraction. Both types of diffraction, as well as the inter- 
mediate cases, can be treated by using Huygens' principle. 

192. Huygens' Principle for Diffraction Problems. — Suppose 
that light from a point source falls on a screen containing aper- 
tures, and that we wish the amplitude at points behind the screen. 
Then we surround the point P, where we wish the field, by a 
surface consisting of the screen, and of a large surface, perhaps 
hemispherical, extending out beyond P, and enclosing a volume 
completely. We apply Huygens' principle to the surface. In 
doing so, we assume (1) that the amplitude of the incident wave, 
at points on the apertures, is the same that it would be if the 


screen were absent; and (2) that immediately behind the screen, 
and at points of the hemispherical surface as well, the amplitude 
is zero, the wave being entirely cut off by the screen. This is, 
of course, an approximation, since at the edge of a slit, for exam- 
ple, the amplitude of the wave does not suddenly jump from zero 
to a finite value. The exact treatment is exceedingly difficult, 
but in the one case for which it has been worked out, it substanti- 
ates our approximations. 

To find the disturbance at P, then, we integrate over the sur- 
face, but set the integrand equal to zero, except at the openings 
of the screen, obtaining 

$$\phi = \iint \frac{iA}{2\lambda}\,\frac{1}{r r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}\,[\cos(n, r) - \cos(n, r_1)]\,dS,$$

the integral being over the openings. We note that only the 
edges of the openings are significant, the shape of the screen 
away from the opening being unimportant. Now let us assume, 
as is almost always true in practice, that the distances $r_1$ and $r$,
from source to screen and from the screen to P, are large compared
with the dimensions of the holes. Then $1/rr_1$ and
$[\cos(n, r) - \cos(n, r_1)]$ are so nearly constant over the aperture that we may
take them outside the integral, replacing $r$ and $r_1$ by mean values
$\bar{r}$ and $\bar{r}_1$. If in addition we write $r + r_1$ in the exponential as
$\bar{r} + \bar{r}_1 + r' + r_1'$, where $r'$ and $r_1'$ are the small differences
between $r$ and $r_1$ and their values at some mean point of the aperture,
we have finally

$$\phi = \frac{iA}{2\lambda\,\bar{r}\,\bar{r}_1}\,[\cos(n, \bar{r}) - \cos(n, \bar{r}_1)]\,e^{2\pi i\nu[t - (\bar{r} + \bar{r}_1)/c]} \iint e^{-2\pi i(r' + r_1')/\lambda}\,dS.$$

The whole factor outside the integral may be taken as a constant 
factor so that, if we are interested only in relative intensities, 
we may leave it out of account. We finally have a sinusoidal 
vibration of which the amplitudes of the components of the two
phases are proportional to $C' = \iint \cos\dfrac{2\pi}{\lambda}(r' + r_1')\,dS$, and
$S' = \iint \sin\dfrac{2\pi}{\lambda}(r' + r_1')\,dS$. Hence the intensity is proportional to
$C'^2 + S'^2$, and our task is to compute this value.

193. Qualitative Discussion of Diffraction, Using Fresnel's
Zones. — By using Fresnel's zones, one can see qualitatively the 



explanation of the diffraction fringes, particularly in Fresnel 
diffraction. Suppose that we join the source S and a point P 
with a straight line, as in Fig. 51, and consider the point of the 
screen cut by this line, a point for which $r + r_1$ has a minimum
value. Let us surround this point by successive closed curves in
which $r + r_1$ differs from its minimum value by successive whole
numbers of half wave lengths. It is not hard to see that these 
curves will be the intersections with the screen of a set of ellipsoids 
of revolution, whose foci are S and P. Hence if the line SP is 
approximately normal to the screen, the curves will be approxi- 
mately circles. Successive zones included between successive 


Fig. 51. — Fresnel's zones on a plane. 

curves will propagate light differing by a half wave length from 
their neighbors. Now on the screen we may imagine the pattern 
of zones, and also the apertures. The whole nature of the diffrac- 
tion depends on what zones are uncovered, and can transmit light, 
and what ones are obscured by the screen. We may distinguish 
three cases, shown in Fig. 52:

1. The center of the system of zones lies well inside the aper- 
ture. The central zone is entirely uncovered, as are a number of 
the others. As we get to larger zones, we shall come to one of 
which a small part is covered; then one which is more covered; 
and so on, until finally we come to one only slightly uncovered; 
and then the rest are entirely obscured. Now we can write our 
integral, as in paragraph 190, as a sum of integrals over the 
successive zones. As before, these contributions will decrease 
very gradually from one zone to the next. When we reach the 


zones that are obscured, the decrease will become a little more 
rapid, but not so much as to interfere with the argument. We 
can still write the whole thing as half the sum of the first and 
the last zones. In our case, the last zone which contributes has 
a negligibly small area exposed, so that it contributes practically 
nothing, and the whole integral is half the first zone. But this 
gives just the intensity we should have in the absence of the
screen.

2. The center of the zone system is well behind the screen (P 
is in the geometrical shadow). Then the first few zones are 


Fig. 52. — Fresnel's zones and rectangular aperture. 

(1) Directly in path of light. 

(2) In geometrical shadow. 

(3) On edge of shadow. 

obscured. A certain zone begins to be uncovered, until finally 
some zones are uncovered to a considerable extent. Large zones 
become obscured again, however. Thus in our sum, while there 
are terms different from zero, both the first and the last terms 
are zero, so that the sum is zero. The intensity well inside the 
geometrical shadow is zero. 

3. The center of the zone system is near the edge of the screen. 
Then the first zone may be partly obscured, so that there is some 
intensity, but not so great as without the screen. Or the first 
zone may be entirely uncovered, but the next ones partly
obscured. In these cases, the contributions from the successive 
zones may differ so much that our rule of taking the first and last 
terms is no longer correct. It is possible for the whole amplitude 
to be more than half the first zone, so that the intensity is actually 
greater than without the screen. As we move into the geometri- 
cal image from the shadow, it turns out that there is a periodic 


fluctuation, on account of the uncovering of successive zones, 
and this explains the diffraction fringes. 
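The zone boundaries described at the beginning of this section can be located numerically when the line SP is normal to the screen. The sketch below solves, by bisection, for the radius at which $r + r_1$ exceeds its minimum by m half wave lengths, and then compares the areas of successive zones. The distances of 100 cm. and the wave length of $5 \times 10^{-5}$ cm. are hypothetical values, not taken from the text.

```python
import math

def zone_radius(m, a, b, wavelength):
    """Radius on the screen at which r + r1 exceeds its minimum a + b by m half wave lengths."""
    target = m * wavelength / 2

    def excess(rho):
        return math.hypot(a, rho) + math.hypot(b, rho) - (a + b)

    lo, hi = 0.0, 1.0
    while excess(hi) < target:       # enlarge the bracket until it contains the root
        hi *= 2
    for _ in range(100):             # bisection
        mid = 0.5 * (lo + hi)
        if excess(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical values in cm.: source and P each 100 cm. from the screen.
a, b, wavelength = 100.0, 100.0, 5.0e-5
radii = [zone_radius(m, a, b, wavelength) for m in range(1, 6)]
areas = [math.pi * (outer ** 2 - inner ** 2)
         for inner, outer in zip([0.0] + radii[:-1], radii)]
print(radii)
print(areas)
```

Under these assumptions the radii grow as the square root of m and the zone areas come out very nearly equal, which is why the contributions of successive uncovered zones decrease only very gradually.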


1. Try to carry out exactly the integration which we did approximately 
by using Fresnel's zones. 

2. The source is at infinity, so that a wave front is a plane. Set up Fres- 
nel's zones, and find the breadth of the nth zone, and its area. 

3. A plane wave falls on a screen in which there is a circular hole. Inves- 
tigate the amplitude of the diffracted wave at a point on the axis, showing 
that there is alternate light and darkness as either the radius of the hole 
increases, or as the point moves toward or away from the screen. (Sugges- 
tion: the integral consists of a finite number of zones.) 

4. A plane wave falls on a circular obstacle. Show that at a point 
behind the obstacle, precisely on the axis, there is illumination of the same 
intensity which we should have if the obstacle were not there. Explain 
why this would not hold for other shapes of the obstacle. 

5. Take a few simple alternating series, as $1/2 - 1/3 + 1/4 - 1/5 \cdots$,
$1/2 - 1/4 + 1/8 \cdots$, $1/2^2 - 1/3^2 + 1/4^2 \cdots$, etc., and find
whether our theorem about the sum of a number of terms is verified for
them. In doing this, it may be necessary to start fairly well out in the
series, so as to satisfy our condition that successive terms differ only slightly
in magnitude.

6. Prove the statement that the boundaries of Fresnel's zones are the 
intersection of the screen with ellipsoids of revolution whose foci are the 
source and the point P. What happens to these ellipsoids as the source is 
removed to infinity? 



In the present chapter we proceed to the mathematical dis- 
cussion of Fresnel and Fraunhofer diffraction, based on the 
methods of Huygens' principle derived in Chap. XXVI. The 
problems which we take up are Fresnel and Fraunhofer diffrac- 
tion through a slit; Fraunhofer diffraction through a circular 
aperture; and the diffraction grating, an example of Fraunhofer 
diffraction. In Eq. (8) of the last chapter, we have seen that the 
essential step in computing the diffraction pattern is the evaluation
of the integral

$$\iint e^{-2\pi i(r + r_1)/\lambda}\,dS,$$

where the integration is over the aperture of the screen, dS is an
element of surface in the aperture, $r$ is the distance from the
source to the element dS, and $r_1$ the distance from the element
to the point P where the field is being found. If the incident
wave is a plane wave, and the plane of the aperture is a wave
front, then $r$ is the same for all elements, and the factor $e^{-2\pi i r/\lambda}$
can be cancelled out of the integral. The remaining integral,
$\iint e^{-2\pi i r_1/\lambda}\,dS$, represents the sum at P of the amplitudes of
spherical waves of equal intensity and phase starting from all 
points of the aperture. It is the interference of these waves which 
produces the diffraction pattern. 

194. Comparison of Fresnel and Fraunhofer Diffraction. — 
The two types of diffraction, Fresnel and Fraunhofer, arise 
from observing the pattern near to, or far from, the screen. 
Let the normal to the screen be the z axis, as in Fig. 53, and let 
the screen containing the aperture be at z = 0. The light 
passing through the aperture is caught on a second screen at 
z = R. Physically, the diffraction pattern has the following 
nature: close to the aperture, the light passes along the z axis 
as a column or cylinder of illumination, of cross section identical 
with the aperture, so that, if the screen at R is close to the 
aperture, the illuminated region will have the same shape as 
the aperture, and we speak of rectilinear propagation of the light. 




As R increases, however, the column of light begins to acquire 
fluctuations of intensity near its boundaries, so that the pattern 
on the screen has fringes around the edges. This phenomenon 
is the Fresnel diffraction. The size of the Fresnel fringes 
increases proportionally to the square root of the distance R. 
Thus Fig. 54 shows, in its upper diagram, the slit, parallel 
column of light, and parabolic lines starting from the edges 
of the slit, indicating the position of the outer bright fringe 
of the Fresnel pattern, if we are sufficiently near to the slit. 
As R becomes larger, the fringes become so large that there are 
only one or two in the pattern of the aperture, and the pattern 

Fig. 53. — Aperture and screen for diffraction through rectangular slit. 

shows but small resemblance to the shape of the aperture, though 
it still is of roughly the same dimensions. With further increase 
of R, we finally enter the region of Fraunhofer diffraction. Here 
the beam of light, instead of consisting of a luminous cylinder, 
resembles more a luminous cone (indicated by the diverging 
dotted lines in the top diagram of Fig. 54). Thus the Fraunhofer 
pattern becomes larger and larger as R increases, being in fact 
proportional to R, so that we can describe it by giving the angles 
rather than distances between different fringes. Often Fraun- 
hofer diffraction is observed, not by placing the screen at a great 
distance, but by passing the light through a telescope focused 
on infinity. Such a telescope brings the light in a given direction 
to a focus at a given point of the field. Thus it separates the 
different Fraunhofer fringes, since each of these goes out from 
the source in a particular direction. In Fig. 54, diffraction 
patterns are shown indicating the transition from Fresnel to 
Fraunhofer diffraction. The pattern a illustrates the Fresnel 



pattern for one edge of an infinitely wide slit. The patterns 
b to g represent the actual diffraction patterns from the slit, 
at distances indicated in the upper diagram. These patterns 
are all drawn to the same scale. They are drawn for a slit 












Fig. 54. — Transition from Fresnel to Fraunhofer diffraction for a slit, 
(a) Fresnel pattern for edge of infinitely wide slit. 

(b)-(g) Actual diffraction patterns from slit, at distances indicated in upper diagram.

(h) Fraunhofer pattern. 

five wave lengths wide, for the sake of getting the figure on a 
diagram of reasonable scale. If the wave length were shorter, 
then for the same slit the distances would be stretched out to 
the right, and the Fraunhofer pattern would correspond to 
smaller angular deflections. This would be necessary to bring 



the Fresnel cases far enough from the slit so that our approxima- 
tions would be really applicable. Finally, in h, we give the 
limiting Fraunhofer pattern, not drawn to scale. 

Let coordinates in the plane of the aperture be $x, y$, and in
the plane of the screen at $R$ let the coordinates be $x_0, y_0$, as in
Fig. 53. Then, if the element of area is at $(x, y, 0)$, and the point
$P$ at $(x_0, y_0, R)$, the distance $r_1$ between them is

$$r_1 = \sqrt{(x_0 - x)^2 + (y_0 - y)^2 + R^2}.$$

The integration cannot be performed with this expression for $r_1$,
and Fresnel and Fraunhofer diffraction lead to two different 

Fig. 55. — $r_1$ as function of $x_0 - x$: $r_1 = \sqrt{(x_0 - x)^2 + R^2}$. $r_1$ is the distance
from a point of the aperture to a point on the screen; $x_0 - x$ is the difference
between the x coordinates of the points.

approximate methods of rewriting $r_1$, leading to different methods
of evaluating the integral. We can see the relation of these two
methods most clearly from Fig. 55, in which $r_1$ is plotted as a
function of $x_0 - x$, for the special case where $y_0 - y = 0$.
The resulting curve is a hyperbola. Now in all ordinary cases,
$R$ is large compared with the dimensions of the aperture. That
is, the range of abscissas representing the dimensions of the
aperture (from $x_0 - x_1$ to $x_0 - x_2$, if $x_1$ and $x_2$ are the extreme
coordinates of the aperture) is small compared with the distance
$R$, the intercept of the hyperbola on the axis of ordinates.
The two cases are now represented by the ranges $ab$ and $cd$ of
abscissas, respectively. In the first, $x_0 - x_1$ and $x_0 - x_2$ are
separately small, as well as their difference, and this means that 
the point P is almost straight behind the aperture, in the region 


where the Fresnel diffraction pattern occurs. In the second,
$x_0$ is large, of the same order of magnitude as $R$, showing that
we are examining the pattern at a considerable angle to the
normal, as we do in the Fraunhofer case. The two approximate
methods can now be simply described from the curve: for
Fresnel diffraction, we approximate the hyperbola near its
minimum by a parabola; for Fraunhofer diffraction, we approximate
it farther out by a straight line. In the first case, assuming
$R$ to be large compared with $(x_0 - x)$, we have by the binomial
theorem

$$r_1 = R + \frac{1}{2}\frac{(x_0 - x)^2}{R} + \cdots,$$

or including the terms in $y$,

$$r_1 = R + \frac{1}{2}\frac{(x_0 - x)^2 + (y_0 - y)^2}{R} + \cdots. \qquad (1)$$

In this case, in the notation of Eq. (8) of the previous chapter,
we take $\bar{r} = R$, so that $r'$ is the remaining term of Eq. (1).
For Fraunhofer diffraction, on the other hand, we have $x_0 \gg x$.
Then we write $r_1^2 = (x_0^2 + y_0^2 + R^2) - 2(xx_0 + yy_0) + x^2 + y^2$,
and we can neglect the terms $x^2 + y^2$. If we let $R_0^2 = x_0^2 + y_0^2 + R^2$,
where $R_0$ measures the distance from the center of the
aperture to the point $P$, we can use a binomial expansion,

$$r_1 = R_0 - \frac{xx_0 + yy_0}{R_0} + \cdots. \qquad (2)$$

In this case we take $\bar{r} = R_0$, so that $r'$ is the remaining term of
Eq. (2). Letting $x_0/R_0 = l$, $y_0/R_0 = m$, the direction cosines
of the direction from the center of the aperture to $P$, we have
$r' = -(lx + my) + \cdots$, involving the position on the screen
only through the angles, so that we see at once that the pattern
will travel outward radially from the aperture.
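
As a numerical illustration of the two approximations (a sketch added here, not part of the original derivation), the following Python fragment compares the exact distance with the parabolic form of Eq. (1) and the linear form of Eq. (2) for the special case $y_0 - y = 0$. The function names and the values of $R$, $x_0$, and the aperture points are arbitrary assumptions chosen only for illustration.

```python
import math

def r1_exact(x0, x, R):
    """Exact distance r1 for the special case y0 - y = 0."""
    return math.sqrt((x0 - x) ** 2 + R ** 2)

def r1_fresnel(x0, x, R):
    """Parabolic form of Eq. (1): R + (x0 - x)^2 / (2R)."""
    return R + 0.5 * (x0 - x) ** 2 / R

def r1_fraunhofer(x0, x, R):
    """Linear form of Eq. (2): R0 - x*x0/R0, with R0^2 = x0^2 + R^2."""
    R0 = math.sqrt(x0 ** 2 + R ** 2)
    return R0 - x * x0 / R0

R = 1000.0                    # assumed distance to the screen (arbitrary units)
aperture = (-1.0, 0.0, 1.0)   # assumed sample points across the aperture

# P nearly straight behind the aperture (x0 small): the parabola is accurate.
err_fresnel = max(abs(r1_exact(2.0, x, R) - r1_fresnel(2.0, x, R))
                  for x in aperture)

# P far off the axis (x0 comparable with R): the straight line is accurate.
err_fraunhofer = max(abs(r1_exact(R, x, R) - r1_fraunhofer(R, x, R))
                     for x in aperture)

print(f"Fresnel error:    {err_fresnel:.2e}")
print(f"Fraunhofer error: {err_fraunhofer:.2e}")
```

Since the distance enters only through the phase $2\pi r_1/\lambda$, either approximation is usable only where its error is small compared with the wave length.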

195. Fresnel Diffraction from a Slit. — Let the aperture be a 
slit, extending from $x = -a/2$ to $x = a/2$, and from $y = -b/2$
to $b/2$. We assume $a$ to be small, $b$ comparatively
large, as in Fig. 53, so that it is a long narrow slit. Using the
results of Eq. (1), our integral is

$$\iint e^{-2\pi i r'/\lambda}\, dS = \iint e^{-\pi i[(x - x_0)^2 + (y - y_0)^2]/R\lambda}\, dS.$$


This can be immediately factored into

$$\int_{-b/2}^{b/2} e^{-\pi i (y - y_0)^2/R\lambda}\, dy \int_{-a/2}^{a/2} e^{-\pi i (x - x_0)^2/R\lambda}\, dx.$$

Since these two integrals are of the same form, we can treat just
one of them. This will prove to give fringes parallel to one
set of axes. The whole pattern is then simply the combination
of the two sets of fringes. The single integral, for instance the
one in $x$, has a real part, and an imaginary part (with sign
changed), equal to

$$\int_{-a/2}^{a/2} \cos\frac{\pi (x - x_0)^2}{R\lambda}\, dx \quad \text{and} \quad \int_{-a/2}^{a/2} \sin\frac{\pi (x - x_0)^2}{R\lambda}\, dx. \qquad (3)$$

It is customary in these integrals to make a change of variables:
$\frac{\pi (x - x_0)^2}{R\lambda} = \frac{\pi}{2}u^2$. Then the integrals become $\sqrt{R\lambda/2}$ times $C$
and $S$, respectively, where

$$C = \int_{u_1}^{u_2} \cos\frac{\pi}{2}u^2\, du, \qquad S = \int_{u_1}^{u_2} \sin\frac{\pi}{2}u^2\, du,$$

and where

$$u_1 = \frac{x_0 - a/2}{\sqrt{R\lambda/2}}, \qquad u_2 = \frac{x_0 + a/2}{\sqrt{R\lambda/2}}.$$

These integrals are called Fresnel's integrals. They cannot be explicitly evaluated,
but their values have been computed by series methods.
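
One way such a series computation might look is sketched below in Python (an illustration added here, not the tabulation method of the original authors). It expands $\cos\frac{\pi}{2}t^2$ and $\sin\frac{\pi}{2}t^2$ in powers of $t$ and integrates term by term from 0 to $u$; the function name `fresnel_C_S` and the number of terms are assumptions.

```python
import math

def fresnel_C_S(u, terms=60):
    """Return (C, S) for the range 0 to u, where
        C = integral of cos(pi t^2 / 2) dt,  S = integral of sin(pi t^2 / 2) dt,
    by integrating the power series term by term.

    The series converge for every u, but round-off grows with |u|;
    this sketch is meant for |u| up to about 3.
    """
    h = math.pi / 2.0
    C = 0.0
    S = 0.0
    for n in range(terms):
        sign = -1.0 if n % 2 else 1.0
        C += sign * h ** (2 * n) * u ** (4 * n + 1) / (
            math.factorial(2 * n) * (4 * n + 1))
        S += sign * h ** (2 * n + 1) * u ** (4 * n + 3) / (
            math.factorial(2 * n + 1) * (4 * n + 3))
    return C, S

C1, S1 = fresnel_C_S(1.0)
print(C1, S1)   # approximately 0.7799 and 0.4383
```

The integrals between general limits $u_1$ and $u_2$ follow by subtraction, for example `fresnel_C_S(u2)[0] - fresnel_C_S(u1)[0]` for $C$.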

196. Cornu's Spiral. — Let us plot the indefinite integral
$\int_0^u \cos\frac{\pi}{2}u^2\, du$ as abscissa, $\int_0^u \sin\frac{\pi}{2}u^2\, du$ as ordinate, of a graph,
as in Fig. 56. Then it is not hard to see that the resulting curve
is a spiral, which is known as Cornu's spiral. To see this, we
can first compute the slope. This is the differential of the ordinate,
over the differential of the abscissa, or

$$\frac{\sin\frac{\pi}{2}u^2\, du}{\cos\frac{\pi}{2}u^2\, du} = \tan\frac{\pi}{2}u^2.$$

Thus, when $u^2$ increases by 4, the tangent of the curve swings
around a complete cycle, and comes back to its initial value.
Each point of the spiral corresponds to a particular value of $u$.
We can show at once that the difference of $u$ between two points
is simply the length of the curve between the points. We show
this for an infinitesimal element of the curve. The square
of the element of length, $ds^2$, is equal to the sum of the squares
of the differentials of abscissa and ordinate, or is
$\cos^2\left(\frac{\pi}{2}u^2\right) du^2 + \sin^2\left(\frac{\pi}{2}u^2\right) du^2 = du^2$.
Hence $ds = du$, and we can integrate to get
$s = u_2 - u_1$. From this fact we can make sure of the spiral
nature of the curve. For one turn of the curve corresponds to
an increase of $u^2$ by 4. That is, if $u'$, $u''$ are the values at the two


Fig. 56. — Cornu's spiral (abscissa $\int_0^u \cos\frac{\pi}{2}u^2\, du$, ordinate $\int_0^u \sin\frac{\pi}{2}u^2\, du$).
The points of the spiral marked by cross bars correspond
to increments of 0.1 unit in $u$.

ends, $u''^2 = u'^2 + 4$. This is $u''^2 - u'^2 = 4$, $(u'' - u')(u'' + u') = 4$,
$u'' - u' = 4/(u'' + u')$. The difference $u'' - u'$ is, however,
simply the length of the turn, so that we see that, as
we go farther along, the turns become smaller and smaller, so
that they eventually become zero, which is characteristic of a
spiral. It is plain that the spiral is symmetric in the origin,
having two points, for $u = \pm\infty$, for which it winds up on itself.

Let us take our spiral, mark on it the positions $u_1$ and $u_2$
corresponding to the limits of our integral, and draw the straight 
line connecting these points. The length of this line will then 


be proportional to the amplitude of the disturbance, and its 
square to the intensity. This is easy to see: the horizontal 
component of the line is just C, and the vertical component S, 
so that the square of its length is C 2 + S 2 . Knowing this, we 
can easily discuss the fluctuations of intensity, as seen in Fig. 54. 
As $x_0$ changes, it is plain that $u_1$ and $u_2$ increase together, their
difference remaining fixed and equal to $a/\sqrt{R\lambda/2}$. Thus essentially
we have an arc of this length, sliding along the spiral,
and the intensity is measured by the square of the chord between
the ends of this arc. Now when $x_0$ is large and negative, the arc
is wound up on itself, so that its ends practically meet, and the
intensity is zero. This is the situation in the shadow. As $x_0$
approaches the value $-a/2$, however, $u_2$ approaches zero, so
that one end of the arc has reached the center of the figure.
There are two quite different cases, depending on whether
$u_2 - u_1$ is large or small. If it is large (a large slit and relatively
short distance $R$ and small wave length), then $u_1$ will still not
be unwound much at this point. The chord will then be half
the value between the two end points of the spiral, and the 
intensity will be one-fourth its value without the screen, and 
will have increased uniformly in coming out of the shadow. 
As we go farther along the x direction, however, the arc will begin 
to wind up on the other half of the spiral, producing alternations 
of intensity at the edge of the shadow. Then for a while $u_2$
will be nearly at one end of the spiral, $u_1$ at the other, so that
the intensity for some distance will be nearly constant, and the
same that we should have without the slit. This is the illuminated
region directly behind the slit. Finally we approach
the other boundary, and $u_1$ commences to unwind. We then
go through the same process in the opposite order. The other
quite different case comes when $u_2 - u_1$ is small, which is the
case for small slit, or large wave length or distance. Then there
is never a time when $u_1$ is on one branch of the spiral and $u_2$
on the other. All through the central part of the pattern,
therefore, there are no fluctuations of intensity. Such fluctua- 
tions come only far to one side or the other. They come about 
in this way: At some places in the pattern, the arc is long enough 
to wind up for a whole number of turns, and the chord is practi- 
cally zero, while at other places it winds up for a whole number 
plus a half, and the chord has a maximum. The resulting fringes 


are the Fraunhofer fringes which we shall now discuss by a
different method.
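
The chord construction can be tried numerically. The Python sketch below (added for illustration) computes $C^2 + S^2$ for the arc from $u_1$ to $u_2$ by Simpson's rule, a substitute for reading the chord off Fig. 56, and divides by 2, the squared chord between the two end points of the spiral, so that the unobstructed wave has intensity 1. The function names and the slit width, wave length, and distance used are assumed values.

```python
import math

def fresnel_chord_sq(u1, u2, steps_per_unit=2000):
    """C^2 + S^2 for the arc of Cornu's spiral from u1 to u2 (Simpson's rule)."""
    n = 2 * max(1, math.ceil(abs(u2 - u1) * steps_per_unit / 2))  # even count
    h = (u2 - u1) / n
    C = 0.0
    S = 0.0
    for i in range(n + 1):
        u = u1 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        phase = 0.5 * math.pi * u * u
        C += weight * math.cos(phase)
        S += weight * math.sin(phase)
    C *= h / 3.0
    S *= h / 3.0
    return C * C + S * S

def slit_intensity(x0, a, R, lam):
    """Intensity at screen coordinate x0, relative to the wave with no screen."""
    s = math.sqrt(R * lam / 2.0)
    u1 = (x0 - a / 2.0) / s
    u2 = (x0 + a / 2.0) / s
    return fresnel_chord_sq(u1, u2) / 2.0

# Assumed example: slit 0.1 cm wide, wave length 6e-5 cm, screen 1 cm away,
# so that u2 - u1 is about 18 (the "large slit" case).
a, R, lam = 0.1, 1.0, 6.0e-5
for x0 in (-3 * a, -a / 2, 0.0):
    print(f"x0 = {x0:+.3f} cm   relative intensity = {slit_intensity(x0, a, R, lam):.4f}")
```

With these values the intensity is nearly zero well inside the shadow, close to one-fourth at the geometrical edge, and close to unity behind the middle of the slit, as the argument from the spiral requires.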

197. Fraunhofer Diffraction from Rectangular Slit. — Using 
the approximation (2), our integral for Fraunhofer diffraction
is $e^{-2\pi i R_0/\lambda} \iint e^{2\pi i (lx + my)/\lambda}\, dS$. The first term, as in Fresnel
diffraction, contributes nothing to the relative intensities, and
may be neglected. We then have $\iint e^{2\pi i (lx + my)/\lambda}\, dS$ as the
integral whose absolute value measures the amplitude of the
disturbance.

Let us suppose that the aperture is the same sort of rectangle
considered above, extending from $-a/2$ to $a/2$ along $x$, from
$-b/2$ to $b/2$ along $y$. Then the integral is

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx \int_{-b/2}^{b/2} e^{2\pi i m y/\lambda}\, dy = \left(\frac{e^{\pi i l a/\lambda} - e^{-\pi i l a/\lambda}}{2\pi i l/\lambda}\right) \left(\frac{e^{\pi i m b/\lambda} - e^{-\pi i m b/\lambda}}{2\pi i m/\lambda}\right) = \frac{\sin (\pi l a/\lambda)}{\pi l/\lambda}\, \frac{\sin (\pi m b/\lambda)}{\pi m/\lambda}.$$


The intensity is the square of this quantity. Let us consider 
its dependence on the position of the point P on the screen. 
The coordinates of this point enter only in the expressions 
$l, m$, showing that the pattern increases in size proportionally
to the distance, as if it consisted of rays traveling out in straight 
lines from the small aperture, rather than having an approxi- 
mately constant size as with the Fresnel diffraction (see Fig. 
54). When we consider the detailed behavior of the intensity 
as a function of the angle, we find that this can be written as
$a^2 \frac{\sin^2 (\pi l a/\lambda)}{(\pi l a/\lambda)^2}$ times a similar function of $m$, giving a curve of
the form $\frac{\sin^2 \alpha}{\alpha^2}$, where $\alpha = \pi l a/\lambda$. This function becomes unity
when $\alpha = 0$, goes to zero for $\alpha = \pi, 2\pi, 3\pi, \ldots$, with maxima of
intensity approximately midway between. The maxima decrease
rapidly in intensity. Thus at the points $3\pi/2, 5\pi/2, \ldots$, which
are approximately at the second and third maxima, the intensities
are only $(2/3\pi)^2, (2/5\pi)^2, \ldots$, or 0.045, 0.016, ..., compared
with the central maximum of 1. Let us see how the size of
the fringes depends on the dimensions of the slit. The minima
come for $\alpha = n\pi$, or $la/\lambda = n$, $l = n\lambda/a$. Thus we see that
the greater the wave length, or the smaller the dimensions of the 
slit, the larger the pattern becomes. 
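
The figures quoted above can be checked with a short Python sketch (added for illustration; the function names are assumptions). It evaluates $\sin^2\alpha/\alpha^2$ at the approximate maxima $3\pi/2$ and $5\pi/2$, and also locates the true maxima by bisection on the condition that the derivative of $\sin\alpha/\alpha$ vanish, that is $\alpha\cos\alpha - \sin\alpha = 0$.

```python
import math

def slit_pattern(alpha):
    """Relative intensity sin^2(alpha)/alpha^2, equal to 1 at alpha = 0."""
    if alpha == 0.0:
        return 1.0
    return (math.sin(alpha) / alpha) ** 2

def true_maximum(n, iterations=80):
    """Position of the n-th secondary maximum (n = 1, 2, ...), found by
    bisection on alpha*cos(alpha) - sin(alpha) = 0 in (n*pi, (n + 1/2)*pi)."""
    f = lambda a: a * math.cos(a) - math.sin(a)
    lo, hi = n * math.pi, (n + 0.5) * math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for n in (1, 2):
    approx = (n + 0.5) * math.pi
    exact = true_maximum(n)
    print(f"approximate maximum {approx:.4f}: {slit_pattern(approx):.4f}   "
          f"true maximum {exact:.4f}: {slit_pattern(exact):.4f}")
```

The true maxima lie slightly inside the midway points and are slightly higher than the values at $3\pi/2$ and $5\pi/2$, but the rapid decrease of intensity is the same.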



The positions of the minima can be immediately found by a 
very elementary argument. Assume for convenience that we 
are investigating the pattern at a point in the xz plane, so that 
$m = 0$. Then draw a plane normal to the direction $l$, passing
through one edge of the aperture, as in Fig. 57. This represents
a wave front of the diffracted wave, just as it passes one edge of
the aperture. From the geometry of the system, this wave front
is a distance $la$ from the other edge, or $la/2$ from the middle of the
aperture. Now, if the distance of the middle is just a whole 
number of half wave lengths different from the distance from the 
edge, the contributions of these two points to the amplitude will 

Fig. 57. — Elementary construction for Fraunhofer diffraction. 

just cancel, being just out of phase. The other points of one 
half of the aperture can all be paired against corresponding points 
of the other half whose contributions are just out of phase, finally 
resulting in zero intensity. This situation comes about when 
$la/2 = n\lambda/2$, where $n$ is an integer, or $l = n\lambda/a$, the same condition
found above. Since most of the intensity falls within the
first minimum, and since $l$ is the sine of the angle between the
ray and the normal to the surface, we may say that by Fraunhofer
diffraction the ray is spread out through an angle $\lambda/a$.

198. The Circular Aperture. — The problem of Fraunhofer 
diffraction through a circular aperture is slightly more complicated
mathematically. Here we must evaluate $\iint e^{2\pi i (lx + my)/\lambda}\, dS$
over a circle. Let us introduce polar coordinates in the plane of
the aperture, so that $x = \rho \cos\theta$, $y = \rho \sin\theta$. Further, on
account of symmetry, we may take the point $P$ to be in the $xz$
plane, so that $m = 0$. Then if $\rho_0$ is the radius of the aperture,
the final result is $\int_0^{2\pi} d\theta \int_0^{\rho_0} e^{2\pi i \rho \cos\theta\, l/\lambda} \rho\, d\rho$. We can integrate with
respect to $\rho$ by parts, obtaining for the integral

$$\int_0^{2\pi} d\theta \left[\frac{\rho_0 e^{2\pi i \rho_0 \cos\theta\, l/\lambda}}{2\pi i \cos\theta\, l/\lambda} - \frac{e^{2\pi i \rho_0 \cos\theta\, l/\lambda} - 1}{(2\pi i \cos\theta\, l/\lambda)^2}\right].$$

For the integration with respect to $\theta$, it is necessary to expand
the exponentials in series. If we do this, the integrals are in
each case integrals of a power of $\cos\theta$, from 0 to $2\pi$. These are
easily evaluated, and the result, combining terms, proves to be

$$\pi\rho_0^2 \left[1 - \frac{1}{2}\left(\frac{k}{1!}\right)^2 + \frac{1}{3}\left(\frac{k^2}{2!}\right)^2 - \frac{1}{4}\left(\frac{k^3}{3!}\right)^2 + \frac{1}{5}\left(\frac{k^4}{4!}\right)^2 - \cdots\right],$$

where $k$ is an abbreviation for $\pi\rho_0 l/\lambda$. If we recall the formulas for
Bessel's functions, we can see without difficulty that this is equal to
$\frac{\rho_0\lambda}{l} J_1\left(2\pi\rho_0 \frac{l}{\lambda}\right)$. It is not hard, using some of the properties of
Bessel's functions, to prove this formula directly, without the
use of series. From the series, we see that the intensity has a
maximum for $l = 0$, the center of the pattern. As $l$ increases,
we can see the behavior most easily from the expression in terms
of Bessel's functions. Since $J_1$ has an infinite number of zeros,
there are an infinite number of light and dark fringes. The
first dark band comes at the first zero of $J_1$, which from tables
is at $2\pi\rho_0 l/\lambda = 1.2197\pi$, $l\rho_0/\lambda = 0.61$. The next is at $\rho_0 l/\lambda = 1.12$,
and so on, with maxima between. We see that, except for
a numerical factor, the pattern from a circular aperture has about
the same dimensions as that from a square aperture. Thus if
the side of the square were equal to the diameter of the circle,
$2\rho_0$, the first dark fringe would be at $2\rho_0 l/\lambda = 1$, $\rho_0 l/\lambda = 0.5$, and
the next one at 1.0.
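
The positions of the dark rings can be checked without tables by a short Python sketch (added for illustration; the function names and bracketing intervals are assumptions). It sums the standard power series for $J_1$ and finds its first two zeros by bisection.

```python
import math

def bessel_j1(x, terms=40):
    """J1(x) from its power series; adequate for moderate x (say |x| < 15)."""
    half = x / 2.0
    total = 0.0
    for m in range(terms):
        total += (-1) ** m * half ** (2 * m + 1) / (
            math.factorial(m) * math.factorial(m + 1))
    return total

def relative_amplitude(k):
    """Amplitude relative to the centre of the pattern: J1(2k)/k,
    where k = pi * rho0 * l / lambda."""
    return 1.0 if k == 0.0 else bessel_j1(2.0 * k) / k

def find_zero(f, lo, hi, iterations=80):
    """Bisection; f(lo) and f(hi) must have opposite signs."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

z1 = find_zero(bessel_j1, 3.0, 4.5)   # first zero of J1
z2 = find_zero(bessel_j1, 6.5, 7.5)   # second zero of J1
print(z1 / math.pi)                   # about 1.2197
print(z1 / (2 * math.pi))             # rho0*l/lambda at the first dark ring
print(z2 / (2 * math.pi))             # rho0*l/lambda at the second dark ring
```

For small $k$ the function `relative_amplitude` reproduces the first terms of the bracketed series, $1 - k^2/2 + k^4/12$.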

199. Resolving Power of a Lens. — Whenever light passes 
through a lens, it is not only refracted, but it has passed through a 
circular aperture, the size of the lens itself or of the diaphragm 
which stops it down, and as a result it is diffracted. Suppose, 
for example, that the lens is the objective of a telescope, and that 
parallel light falls on it, as from an infinitely small or distant 
star. Then after passing through the diaphragm, the light will 
no longer be a plane wave, but will have intensity in different 
directions, as shown in the last section. The central maximum 


will have an angular diameter of $0.61 \lambda/\rho_0$, where $\rho_0$ is now the
radius of the telescope objective. The resulting waves are just 
as if the light came from an object of this diameter, but passed 
through no diaphragm. When the telescope focuses the radia- 
tion, the result will be not a single point of light, but a circular 
spot surrounded by fringes, as of a star of finite diameter. For 
this reason, the telescope is not a perfect instrument, and one 
would say that its resolving power was only enough to resolve 
the angle $0.61 \lambda/\rho_0$. This is usually taken to mean the following:
if two stars had an actual angular separation of this amount, the 
center of the image of one star would lie on the first dark fringe 
of the other, and the patterns would run into each other so that 
they could be just resolved. We see that the larger the aperture 
of the telescope, or the smaller the wave length, the better is the 
resolution. The same general situation holds for microscope 
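
As a rough numerical illustration of the angle $0.61\lambda/\rho_0$ (a sketch added here; the wave length and objective radius are assumed values, not taken from the text):

```python
import math

def resolving_angle(wavelength, radius):
    """Angle in radians, 0.61 * wavelength / radius (same length units)."""
    return 0.61 * wavelength / radius

wavelength = 5.5e-5   # cm, assumed (green light)
radius = 5.0          # cm, assumed radius of the objective
theta = resolving_angle(wavelength, radius)
arcsec = math.degrees(theta) * 3600.0
print(f"{theta:.3e} rad = {arcsec:.2f} seconds of arc")
```

Doubling the radius of the objective, or halving the wave length, halves this angle.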

200. Diffraction from Several Slits; the Diffraction Grating. — 
Suppose we have a number N of equal, parallel slits, equally 
spaced. Let each have the width a along the x axis, and let the 
spacing on centers be $d$, so that the centers come at $x = 0, d, \ldots, (N - 1)d$.
Now let us find the Fraunhofer pattern. The part
of the integral depending on $y$ will be just as with the single slit,
and we leave it out of account. We are left with

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx + \int_{d-a/2}^{d+a/2} e^{2\pi i l x/\lambda}\, dx + \cdots + \int_{(N-1)d-a/2}^{(N-1)d+a/2} e^{2\pi i l x/\lambda}\, dx.$$

But this is, as we can immediately see, simply

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx \left(1 + e^{2\pi i l d/\lambda} + e^{2\pi i l 2d/\lambda} + \cdots + e^{2\pi i l (N-1)d/\lambda}\right).$$

By the formula for the sum of a geometric series, this is

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\, dx \left(\frac{1 - e^{2\pi i l N d/\lambda}}{1 - e^{2\pi i l d/\lambda}}\right).$$

Let the first term be $A$, the amplitude due
to a single slit, which we have already evaluated. Now to find
the intensity we multiply this by its conjugate, which gives

$$A^2\, \frac{1 - \cos (2\pi l N d/\lambda)}{1 - \cos (2\pi l d/\lambda)} = A^2\, \frac{\sin^2 (\pi l N d/\lambda)}{\sin^2 (\pi l d/\lambda)}. \qquad (4)$$

That is, with $N$ slits the actual intensity is that with one slit,
but multiplied by a certain factor. This factor goes through
zero when $lNd/\lambda$ is an integer, so that $l$ equals an integer multiplied
by $\lambda/Nd$. This gives fringes with a narrow spacing, characteristic
of the whole distance $Nd$ occupied by the set of apertures,
crossing the other pattern, and they are what are usually called
interference fringes, since they are due, not to diffraction from
a single aperture, but to interference between different apertures.
But in addition to this, the denominator results in having these
fringes of different heights. The minimum height occurs when
the denominator equals unity, when the fringes are of height $A^2$,
and the most intense fringes come when the denominator is zero.
Here the ratio of numerator to denominator is evidently finite,
and gives fringes of height $N^2A^2$. Thus the greater $N$ is, the
greater the disparity in height between the largest and smallest
maximum. Evidently every $N$th maximum will be high, and
the high ones will be spaced according to the law $ld/\lambda = k$, an
integer.


Now suppose $N$ becomes very great, as in a diffraction grating.
Then the small maxima will become so weak compared with the
strong ones that only the latter need be considered. The latter
will seem to consist of a set of sharp lines, with darkness between.
These sharp lines come, as we have seen, at angles to the normal
given by $k\lambda = d \sin\theta$, where $k$ is an integer, and $\sin\theta = l$. This
is the ordinary diffraction grating formula, where $k$ is 0 for the
central image, 1 for the first-order spectrum, 2 for the second
order, etc. But we cannot entirely neglect the fact that there are
other small maxima near the important ones. Thus for $ld/\lambda = k$,
the intensity is $N^2A^2$. This comes for $lNd/\lambda = Nk$. But
for $lNd/\lambda = Nk + \frac{3}{2}$, we again have a secondary maximum, whose
height is now

$$\frac{A^2}{\sin^2 (\pi l d/\lambda)} = \frac{A^2}{\sin^2 \frac{\pi}{N}\left(Nk + \frac{3}{2}\right)} = \frac{A^2}{\sin^2 \pi\left(k + \frac{3}{2N}\right)}.$$

Now $\sin^2 \pi\left(k + \frac{3}{2N}\right) = \left(\frac{3\pi}{2N}\right)^2$ approximately, if $N$ is large,
so that the height of the maximum is $4N^2A^2/9\pi^2$, or about 0.045
of the height of the highest maximum. Thus the first few second- 
ary maxima cannot be neglected. To get an idea of the width of 
the region through which the intensity is considerable, we may 
take the width of the first maximum. From the center to the 
first dark fringe, this is given by the fact that at the center 
$lNd/\lambda = Nk$, at the dark fringe $= Nk + 1$, so that $\Delta l = \lambda/Nd$.
This is closely connected with the resolving power of a grating. 
For a single frequency gives not a sharp set of lines, one for each 
order, but a set broadened by the amount we have found. Thus 


two neighboring frequencies, differing by $\Delta\lambda$, could not be resolved
if the first minimum of one lay opposite the maximum of the
other. Since $l = \lambda k/d$, this would be the case if $\Delta l = \Delta\lambda\, k/d = \lambda/Nd$,
or if $\Delta\lambda/\lambda = 1/Nk$. The resolving power thus increases
as the number of lines in the grating increases, and as the order 
of the spectrum increases. 
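
The behavior of the factor in Eq. (4) can be checked numerically. The Python sketch below (added for illustration) evaluates it at a principal maximum, at the adjacent dark fringe, and at the first secondary maximum; the function names and the values of $N$, $d$, and $\lambda$ are assumptions.

```python
import math

def grating_factor(l, N, d, lam):
    """Interference factor sin^2(pi l N d/lam) / sin^2(pi l d/lam) of Eq. (4)."""
    x = math.pi * l * d / lam
    den = math.sin(x) ** 2
    if den < 1e-24:           # principal maximum: the ratio tends to N^2
        return float(N * N)
    return math.sin(N * x) ** 2 / den

def smallest_resolvable(lam, N, k):
    """Wave-length difference lam/(N k) separated in order k by N slits."""
    return lam / (N * k)

N, d, lam, k = 100, 2.0e-4, 6.0e-5, 1   # assumed values, lengths in cm
l_peak = k * lam / d                    # principal maximum, l d/lam = k
l_zero = (k + 1.0 / N) * lam / d        # first dark fringe beside it
l_side = (k + 1.5 / N) * lam / d        # first secondary maximum

print(grating_factor(l_peak, N, d, lam))            # N^2
print(grating_factor(l_zero, N, d, lam))            # practically zero
print(grating_factor(l_side, N, d, lam) / N ** 2)   # about 0.045
print(smallest_resolvable(lam, N, k))               # lam / (N k)
```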


1. Carry through a discussion of Fresnel diffraction from a slit, when the 
source is at a finite distance, directly behind the center of the slit. In what 
ways will the result differ from the case we have discussed? 

2. Light of wave length 6,000 A. falls in a parallel beam on a slit 0.1 mm. 
broad. Work out numerical values for the intensity distribution across the 
slit, at three distances, first, in which the Fresnel fringes are small compared 
with the size of the pattern, second in which they are of the same order of 
magnitude, and third, in which they are Fraunhofer fringes. Either con- 
struct Cornu's spiral yourself, from tables of Fresnel's integrals, or use the 
one of Fig. 56. 

3. Find the coordinates of the points at which Cornu's spiral winds up on 
itself. From the chord between these points, compute the intensity behind 
an infinitely broad slit, which essentially means no slit at all. Find whether
this agrees with what you should expect it to be. 

4. Prove that the maxima of the function

$$\frac{\sin^2 (\pi l a/\lambda)}{(\pi l a/\lambda)^2} = \frac{\sin^2 \alpha}{\alpha^2}$$

are determined by the equation $\alpha = \tan\alpha$. Find the first three solutions
of this transcendental equation and compare them with the approximate
solutions $\alpha = 3\pi/2, 5\pi/2, 7\pi/2$.

5. Discuss the Fresnel diffraction pattern caused by an edge coincident 
with the y axis, the screen occupying one-half the xy plane. The diffraction 
pattern is obtained in a plane parallel to the xy plane and a distance R from 
it. Plot the variation of intensity of light along the x direction from a 
region inside the shadow to well into the directly illuminated area. Prove 
that the intensity of light just at the edge of the geometrical shadow is
one-fourth of its value if there were no diffraction edge. 

6. Evaluate the Fresnel integrals $\int_0^u \cos\frac{\pi}{2}u^2\, du$ and $\int_0^u \sin\frac{\pi}{2}u^2\, du$ in a power
series. What is the range of convergence of these series?

7. Evaluate the Fresnel integrals in series of the form

$$\cos\left(\frac{\pi}{2}u^2\right) S_1 + \sin\left(\frac{\pi}{2}u^2\right) S_2,$$

where $S_1$ and $S_2$ are power series in $u$. What is the range of convergence
of these series?

8. Find a semiconvergent series for the Fresnel integrals of the same form
as in Prob. 7 where the power series are now in inverse powers of $u$. (Hint:
Write $\int_x^\infty \cos x^2\, dx = \int_x^\infty x \cos x^2\, \frac{dx}{x}$ and integrate by parts, repeating the
process.) Calculate the remainder in these series after the $n$th term. Show
that this is smallest when $n$ is about $x^2/2$.



The beautiful success of the wave theory in explaining diffrac- 
tion patterns, which we have been discussing in the last chapter, 
has been the best proof of the correctness of this theory. But 
the proof has not always gone unchallenged. Ever since the 
time of Newton, at least, there has been a rival theory, the corpuscular
theory. Newton imagined light to consist of a stream
of particles. These particles, or corpuscles, traveled in straight 
lines in empty space, and were reflected by mirrors as billiard 
balls would be by walls, making equal angles of incidence and 
reflection. Refraction was explained by supposing that different 
media had different attractions for the corpuscles. Thus glass 
would attract them more than air, the potential energy of a 
corpuscle being constant within any one medium, but being lower 
in glass than in air, so that the corpuscles would have a normal 
component of acceleration toward the glass, without correspond- 
ing tangential acceleration, and would be bent toward the normal 
on entering the glass. By working out this idea, the law of 
refraction easily follows. Newton was aware of the wave theory;
Huygens was advocating it at the time. But his objection was
that light travels in straight lines, whereas the waves he was 
familiar with, waves of sound or water waves, certainly are bent 
out in all directions on passing through apertures. Newton 
considered this to be a fatal objection to the wave theory. 

The answer to this objection, of course, came later with the 
quantitative investigation of diffraction. In the preceding 
chapter, we have seen that a plane parallel wave, falling on a small 
aperture of dimension a, does not form a perfectly parallel ray 
after emerging from the hole. On the contrary, it spreads out, 
first by forming fringes on the edges of the ray (Fresnel diffraction),
then at greater distance by developing a conical form with
definitely diverging rays (Fraunhofer diffraction). The angle
of this cone is of the order of magnitude of $\lambda/a$, where $\lambda$ is the



wave length. Newton was tacitly assuming that the wave 
length, as with sound, was large, that $\lambda/a$ would be large for a
small slit, and there would be large spreading out and a com- 
pletely undefined ray. But it was found early in the nineteenth 
century that the wave length was really so small that, with 
apertures of ordinary size, we can neglect diffraction, and obtain 
an almost perfectly sharp ray, a band of light separated from the 
darkness by sharp, straight edges. 

201. The Quantum Hypothesis. — More recently, in the present 
century, a more serious argument for a corpuscular theory has 
appeared. This is the hypothesis of quanta, originated by Planck 
in discussing the radiation from a heated black body. The most 
graphic application of this hypothesis was made by Einstein to 
the theory of the photoelectric effect. It is known that light of 
frequency $\nu$, falling on a metal surface, liberates electrons, as for
example in the photoelectric cell. Now the law of emission is
remarkable: the energy of each emitted electron, independent
of the intensity of the light, is a definite amount proportional to
the frequency, $h\nu$, where $h$ is Planck's constant, equal to
$6.54 \times 10^{-27}$ in c.g.s. units, introduced by him in his first discussion.
This energy of the emitted electron is really decreased by the
amount of energy it loses in penetrating the surface, so that
$h\nu$ will act as a maximum energy, rather than the energy of each
electron. Of course, the total emission is proportional to the 
intensity of the light, but increasing the intensity increases the 
number of electrons, not their energies. 
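
To give an idea of the magnitudes involved, the following Python sketch (added for illustration) evaluates $h\nu$ with the value of $h$ quoted above. The velocity of light, the conversion to electron volts, the wave length, and the energy lost at the surface are assumed values, and the function names are hypothetical.

```python
PLANCK_H = 6.54e-27      # erg sec, the value quoted in the text
LIGHT_SPEED = 2.998e10   # cm/sec, assumed (not given in this section)
ERG_PER_EV = 1.602e-12   # ergs per electron volt, assumed conversion

def photon_energy(wavelength_cm):
    """Energy h*nu in ergs, with nu = c / wavelength."""
    return PLANCK_H * LIGHT_SPEED / wavelength_cm

def max_electron_energy(wavelength_cm, surface_loss_erg):
    """Greatest energy of an emitted electron: h*nu less the energy lost in
    penetrating the surface; 0.0 if h*nu is too small to free an electron."""
    return max(0.0, photon_energy(wavelength_cm) - surface_loss_erg)

wavelength = 4.0e-5              # cm (4,000 A.), assumed
surface_loss = 2.0 * ERG_PER_EV  # assumed loss of 2 electron volts

print(photon_energy(wavelength) / ERG_PER_EV)                      # h*nu in electron volts
print(max_electron_energy(wavelength, surface_loss) / ERG_PER_EV)  # maximum energy in electron volts
```

The intensity of the light does not appear anywhere in these functions; it would affect only the number of electrons emitted.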

Einstein's hypothesis to explain the photoelectric effect was 
that the energy of the wave was not to be computed in a continu- 
ous manner by Poynting's vector, but that it was localized in 
little particles or corpuscles (now called photons), each of energy 
$h\nu$. Then it would be perfectly obvious that if no photon fell on
a spot of the metal, no electron would be ejected; but that a 
photon which happened to fall on a given place would transfer 
all its energy to an electron, being absorbed, and ceasing to exist 
as light. The intensity of light would be measured simply by 
the number of photons crossing an arbitrary surface per second, 
times the energy carried by each photon. 

Einstein's hypothesis found many supports. One of these 
comes from the structure of atoms. Atoms emit monochromatic 
spectrum lines, falling often into regular series. Bohr was able 
to explain this, at least in hydrogen, the simplest atom, by assum- 


ing that the atom was capable of existing only in certain definite 
stationary states, each of a definite energy. He supposed that 
radiation was not emitted continuously, as the electromagnetic 
field from a rotating or vibrating particle would be, but that the 
atom stayed in one energy level until it suddenly made a jump 
to a second, lower, level, with emission of a photon. If the higher 
energy is $E_2$, the lower $E_1$, the energy of the photon would be
$E_2 - E_1$, so that its frequency would be $E_2/h - E_1/h$. This
formula has proved to be justified by great amounts of experi- 
mental material. First, it states that the frequencies emitted by 
atoms should be the differences of "terms" E/h, each referring 
to an energy level of the atom. This is found to be true in spec- 
troscopy, and has been the most fruitful idea in the development 
of that science. Even tremendously complicated spectra can 
now be analyzed to give a set of terms, and the number of terms 
is much less than the number of lines, since any pair of terms, 
subject to certain restrictions, gives a line. But also, Bohr was 
able to set up a system of mechanics to govern the hydrogen 
atom, very simple in its fundamentals, though different from 
classical mechanics, which gives a very simple formula for the 
energy levels, agreeing perfectly with the extremely accurate 
experimental values. Bohr's idea of stationary states, in turn, 
was tested by experiments on electron bombardment. It was 
found that an atom in a state of energy $E_1$ could be bombarded
by an electron. If the electron's energy, as determined from the
electrical difference of potential through which it had fallen, was
less than $E_2 - E_1$, where $E_2$ is the energy of the upper state (we
consider only one), it would bounce off elastically, without loss
of energy. But if its energy was $E_2 - E_1$, or greater, it would
often raise the atom to the upper state, which could be proved
by subsequent radiation by the atom, and would lose this amount
of energy itself. This definitely verified the existence of sharp
energy levels in the atom. At the same time, it furnishes an
example of a very interesting phenomenon. An electron bombards
an atom, loses energy $E_2 - E_1$. This energy is emitted
as a photon $h\nu$. The photon falls on a metal, is absorbed, ejects
a photoelectron of energy $E_2 - E_1$ (minus a little, for the work
of coming through the surface). The photoelectron bombards an 
atom, loses its energy, which goes off as a photon. Energy, in 
other words, passes back and forth from electrons to photons
indiscriminately. If electrons are particles, surely photons are particles too.
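
The remark that a few terms account for many lines can be illustrated with a minimal Python sketch. The four term values below are hypothetical numbers chosen only for illustration, not measured data, and the restrictions that forbid certain pairs of terms are ignored.

```python
from itertools import combinations

def lines_from_terms(terms):
    """Return every frequency obtainable as a difference of two terms E/h.

    Each unordered pair of distinct terms gives one line, so n terms give
    at most n*(n-1)/2 lines (fewer once restrictions exclude some pairs).
    """
    return sorted(abs(upper - lower) for lower, upper in combinations(terms, 2))

# Hypothetical term values E/h in arbitrary frequency units (illustration only).
terms = [0.0, 10.0, 13.0, 14.5]
lines = lines_from_terms(terms)

print(len(terms), "terms give", len(lines), "lines:", lines)
```

With these assumed values four terms already give six lines, and the disproportion grows quickly as the number of terms increases.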

202. The Statistical Interpretation of Wave Theory. — All 

these phenomena suggesting photons, and a corpuscular structure 
for light, must not cause one to forget that light still shows inter- 
ference, and that the arguments for the wave theory are as strong 
as ever. Various attempts were made to set up laws of motion 
for the photons, which would lead to the correct laws of interfer- 
ence and diffraction (Newton had already done it for refraction), 
but without success. We can see easily why this should be so. 
Consider very weak light, so weak that we only have a photon 
every minute, for example, going through a diffraction grating. 
Such weak light, we know experimentally, is diffracted just like 
stronger light. But that means, as we saw in the last chapter, 
that the resolving power depends on a cooperation of the whole 
grating; if half of it were shut off, its resolving power would be 
decreased, and the intensity distribution changed. Even the 
single photon shows evidence of the full resolving power, in that 
if we make a large enough exposure to have many photons, so 
that we can develop the photograph and measure the blackening, 
which surely measures the number of photons which have struck 
the plate, we find the full resolving power of the grating in the 
final photograph. But it is difficult to imagine any law of motion 
of a photon which will depend on rulings over the whole face of a 
grating, if the photons went through only one point of it. 

After such difficulties, the theory that has emerged is a com- 
bination of wave theory and corpuscular theory. It is assumed 
that atoms emit wave fields as in the electromagnetic theory, 
emitted by certain oscillators connected with the atom, and 
vibrating with the emitted frequencies. These waves do not 
carry energy, but serve merely to determine the probable motion 
of the photons. The rate of emission of waves by the oscillator 
determines the probability of emission of photons. The Poyn- 
ting's vector at any point of the radiation field determines the 
probability that a photon will cross unit cross section normal to 
the radiation, per second. If the oscillator is damped with time, 
that indicates that the probability of emission of a photon 
decreases with time; that is, that the probability that the atom is 
in its upper, excited state, from which it could emit the radiation, 
is decreasing with time. One can carry such a probability con- 
nection through in detail. 


Probably the most graphic picture of the probability relation 
between photons and waves is obtained if we imagine very weak 
light, in which photons come along one in several seconds, forming 
a diffraction pattern. The diffraction pattern is assumed to be 
on a screen which is capable of registering the individual photons 
as they come along. This screen might be a photographic plate, 
in which a single photon is enough to make a grain developable, 
or it might be a screen having slits opening into Geiger counters 
or other devices for registering individual photons. Of course, 
the only way of detecting that there was light falling on the 
screen would be to detect the photons. First, one photon would 
strike the screen, in one spot, then another photon in another 
spot, and so on. So long as there were only a few photons, the 
arrangement might seem to be haphazard. But as more and 
more photons were present, we could find where they were densely 
distributed, and where there were only a few. It would then 
prove to be the case that the places where photons were dense 
were just those places where the wave theory predicted a large 
intensity, and the places where there were no photons were those 
where the wave theory indicated darkness. 
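
The gradual build-up of a diffraction pattern from individual photons can be imitated numerically. The following is only a sketch under stated assumptions: the wave intensity is taken to be the single-slit Fraunhofer pattern (sin u / u)^2, the function names are hypothetical, and the photons are drawn by rejection sampling with a fixed random seed so that the run is repeatable.

```python
import math
import random

def slit_intensity(u):
    """Relative single-slit intensity (sin u / u)^2, with u = pi * a * sin(theta) / lambda."""
    if u == 0.0:
        return 1.0
    return (math.sin(u) / u) ** 2

def sample_photon(rng, u_max):
    """Draw one photon position by rejection sampling from the wave intensity."""
    while True:
        u = rng.uniform(-u_max, u_max)
        if rng.random() < slit_intensity(u):
            return u

def histogram(positions, u_max, n_bins):
    """Count photons in n_bins equal bins covering [-u_max, u_max]."""
    counts = [0] * n_bins
    width = 2.0 * u_max / n_bins
    for u in positions:
        index = min(int((u + u_max) / width), n_bins - 1)
        counts[index] += 1
    return counts

rng = random.Random(0)       # fixed seed so the run is repeatable
u_max = 3.0 * math.pi        # screen covers the central maximum and two side maxima
for n_photons in (10, 100, 10000):
    hits = [sample_photon(rng, u_max) for _ in range(n_photons)]
    print(n_photons, histogram(hits, u_max, 12))
```

With only ten photons the counts look haphazard; with ten thousand the bins near the center are heavily populated and those next to the dark fringes are nearly empty, as the wave theory requires.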

203. The Uncertainty Principle for Optics. — It is characteristic 
of the theory that no law of motion of photons is assumed beyond 
this probability; according to the present view, no such detailed 
laws exist. Given a plane monochromatic wave of light, we
know exactly the energy of each photon ($h\nu$), and its momentum
(this proves to be $h\nu/c = h/\lambda$, pointing in the direction of the
wave normal), but, if the intensity is uniform over space, we have
no information as to the position of the photon. If we let the
plane wave fall on a slit of width $a$, the light passing through will
be more defined as to its position in space. It will be in the form
of a small ray or beam, spreading by diffraction, but still, in the
region of Fresnel diffraction, of width approximately $a$. Thus,
if $x$ is the coordinate along the wave normal, $y$ the coordinate at
right angles, the photon will surely be in a beam whose length
along the $x$ axis is infinite, but of width only about $a$ along the
$y$ axis, as in Fig. 58. That is, the uncertainty in the $y$ coordinate
has been reduced to $a$: $\Delta y = a$, if $\Delta y$ is the uncertainty. At the
same time, however, a compensating uncertainty in the momentum
has appeared. The wave is now spreading, the wave normals
making angles up to about $\lambda/a$ with the $x$ axis, as shown
in Sec. 197. Thus, if the whole momentum remains $p = h/\lambda$,
this will have a component along $y$, equal to $p$ times the sine of
the angle between the momentum and the $x$ axis, or approximately
$p\lambda/a = h/a$. But we do not know which angle, up to the
maximum, the actual deviation will make, for all we know is that
the photon is somewhere in the diffraction pattern. Hence the
uncertainty in $y$ momentum is of the order of magnitude of
$h/a$. If we call it $\Delta p_y$, we have the relation

$$\Delta y \, \Delta p_y = a \cdot \frac{h}{a} = h. \qquad (1)$$

Fig. 58. — Uncertainty principle in diffraction through slit. $\Delta p / p = \lambda / \Delta q$, $\Delta p \, \Delta q = \lambda p = h$. (Compare Fig. 54, top diagram.)

This is an example of the uncertainty principle, concerning the
amount of uncertainty inherent in the description of the motion
of photons by the probability relations with wave theory.
Further examination indicates that this law is very general: 
where a beam is limited to acquire more accurate information 
about the coordinates of the photon, we make a corresponding 
loss in our knowledge as to its momentum, and vice versa. 
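
The slit argument can be checked with a few lines of arithmetic. This is a sketch only: the wave length of 5,000 A. and the slit widths are assumed example values, the function name is hypothetical, and the sine of the small diffraction angle is replaced by the angle itself, as in the text.

```python
H = 6.54e-27  # Planck's constant in erg sec, the value used in this chapter

def slit_uncertainty(wavelength_cm, slit_width_cm):
    """Return (delta_y, delta_p_y) for a photon passing a slit of width a."""
    momentum = H / wavelength_cm                   # p = h / lambda
    spread_angle = wavelength_cm / slit_width_cm   # diffraction angle, about lambda / a
    delta_p_y = momentum * spread_angle            # p * (lambda / a) = h / a
    return slit_width_cm, delta_p_y

wavelength = 5.0e-5  # 5,000 A. expressed in cm (assumed example value)
for a in (1.0e-1, 1.0e-2, 1.0e-3):
    dy, dpy = slit_uncertainty(wavelength, a)
    print(f"a = {a:.0e} cm: dy * dp_y / h = {dy * dpy / H:.3f}")
```

Narrowing the slit reduces the uncertainty in position and raises the uncertainty in momentum in exact proportion, so the product stays equal to h whatever the slit width.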

A similar relation holds between energy and time. Suppose
we have a shutter over our hole, and open it and close it very
rapidly, so as to allow light to pass through for only a very short
interval of time $\Delta t$. Then the wave on the far side is an interrupted
sinusoidal train of waves, and we know by our Fourier
analysis, as in Sec. 185, that the frequency is no longer a definitely
determined value, but is spread out through a frequency band
of breadth $\Delta\nu$, given by $\Delta\nu/\nu = 1/(\text{number of waves in train})$.
Now the number of waves in the train is $c\,\Delta t$, the length of the
train, divided by $\lambda$. Hence $\Delta\nu/\nu = \lambda/(c\,\Delta t)$, $\Delta\nu\,\Delta t = 1$. Using
$E = h\nu$, we have

$$\Delta E \, \Delta t = h, \qquad (2)$$

an uncertainty relation between $E$ and $t$, showing that energy
and time are roughly equivalent to momentum and coordinate:
if we try to measure exactly when the photons go through the
hole, their energy becomes slightly indeterminate. Further,
here we know that the $x$ coordinate is now determined, at any
instant of time, with an accuracy $c\,\Delta t$: the photon must be in
the little puff of light, or wave packet, sent through the pinhole
while the shutter was open. Thus $\Delta x = c\,\Delta t$. But now the $x$
component of momentum, which to the first order is the momentum
itself, is uncertain. For $p_x = p = h\nu/c$, $\Delta p_x = (h/c)\,\Delta\nu = h/(c\,\Delta t) = h/\Delta x$, so that

$$\Delta x \, \Delta p_x = h, \qquad (3)$$

again the uncertainty relation. We can, in other words, make 
our wave packet smaller and smaller, until it seems almost like 
a particle itself, and its path is the path of the photon. The 
wave packet will be reflected and refracted, just as large waves 
would be, giving the laws of motion of photons in refracting 
media. But if we try to go too far, making the wave packet 
too small, we defeat our purpose, and make it spread out by 
diffraction. We cannot, that is, get exactly accurate knowledge 
about the laws of the photon's motion from the probability 
relation. In some cases, this is even more obvious than here. 
Thus, if a wave packet is sent through a diffraction grating, it 
will spread out much as a plane wave would, into the various 
orders of the diffraction pattern. We cannot, then, make any 
prediction at all, except a statistical one, as to which order 
of the pattern a given photon will go to. We completely lose 
track of the paths of individual photons in a diffraction pattern. 
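
The shutter argument of this section can be followed numerically in the same way. The sketch below simply repeats the chain of relations used above; the wave length and the shutter time are assumed example values, and the function name is hypothetical.

```python
H = 6.54e-27   # Planck's constant, erg sec (value used in this chapter)
C = 3.0e10     # velocity of light, cm per second

def shutter_spreads(wavelength_cm, open_time_sec):
    """Spreads produced by a shutter that stays open for open_time_sec."""
    frequency = C / wavelength_cm
    waves_in_train = C * open_time_sec / wavelength_cm   # train length / wave length
    delta_nu = frequency / waves_in_train                # delta_nu / nu = 1 / (number of waves)
    delta_e = H * delta_nu                               # from E = h nu
    delta_x = C * open_time_sec                          # length of the wave packet
    delta_p_x = H * delta_nu / C                         # from p = h nu / c
    return delta_e, delta_x, delta_p_x

dt = 1.0e-9  # assumed shutter time, sec
delta_e, delta_x, delta_p_x = shutter_spreads(5.0e-5, dt)
print("dE * dt / h   =", delta_e * dt / H)
print("dx * dp_x / h =", delta_x * delta_p_x / H)
```

Both ratios come out equal to unity apart from rounding, independently of the wave length and of the shutter time chosen.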

204. Wave Mechanics. — It is now a remarkable fact that many
indications point out that there is the same dualism between 
waves and particles in mechanics that there is in optics. We have 
seen one in the way energy passes from electrons to photons, 
and back again. We can paraphrase our earlier remark by 
saying that surely if photons are connected with waves, electrons 
are connected with waves too. But there are more substantial 


reasons. In discussing the statistical relation of waves and 
photons, we mentioned that the electromagnetic waves were 
produced by oscillators, and it appears that these oscillators 
have only a statistical relation to the atoms. Thus we noted 
that the oscillators connected with radiating atoms would be 
exponentially damped, while the atoms were discontinuously 
jumping from an excited state to a lower state from which they 
did not radiate. This suggests a statistical connection between 
the oscillators and the atoms or electrons, the number of atoms 
in the excited state at any instant being related to the instan- 
taneous amplitude of the corresponding oscillators, as the number 
of photons is related to the amplitude of the electromagnetic 
wave. But there are two compelling reasons which have led 
to the acceptance of the connection between the motion of 
particles and waves. The first was the experimental proof, 
by Davisson and Germer, G. P. Thomson, and others, that 
electrons can show the same sort of diffraction effects that light 
shows, being diffracted by crystals, and even by ruled gratings. 
The second was the fact, discussed by de Broglie and developed
by Schrödinger, that the stationary states of atoms and molecules
correspond to the various overtones of a standing wave system.
Thus the waves associated with particles not only can have
progressive form, connected with particles traveling along, but 
can also exist as standing waves, and these are precisely the 
oscillators which are statistically connected with the atoms, 
and which represent the stationary states of Bohr's theory. 
We shall elaborate the theory of these stationary states in 
succeeding chapters. 

It is definitely settled, then, that mechanics is just as much 
a wave phenomenon as optics is. The wave mechanics leads 
to Newtonian mechanics as a limiting case, just as the wave 
theory of light leads to geometrical optics, where one treats 
rays only, and where one can assume that the light consists of 
particles following fixed paths and moving according to fixed 
laws. Our work, so far in this book, has been divided roughly 
into two sections, mechanics, and the electromagnetic theory 
and optics. We now commence a third section, of equivalent 
importance, on wave mechanics. But as the standing waves 
of wave mechanics are often the atoms themselves, it is natural 
that our treatment should be intimately bound up with the struc- 
ture of matter, a subject which one can mostly leave out in 


speaking of mechanics or optics, but which is of the very essence 
of the problem with wave mechanics. 

205. Frequency and Wave Length in Wave Mechanics. — If 
we are considering a mechanical particle of energy E, momentum 
p in a given direction, we assume that associated with it is a wave 
(of course, not a light wave or a vibrational wave of a material
medium; we are now accustomed in physics to the idea of purely
mathematical waves, without reference to any medium) whose
frequency $\nu$ and wave length $\lambda$ are given by the equations

$$E = h\nu, \qquad p = \frac{h}{\lambda}, \qquad (4)$$

the wave normal being in the direction of motion of the particle. 
The reason why one ordinarily is not conscious of the wave 
nature of mechanics is the extraordinarily small wave length 
involved. A particle of mass 1 gm., moving with velocity
1 cm. per second, has a wave length given by $h/\lambda = mv = 1$,
$\lambda = h/1 = 6.54 \times 10^{-27}$ cm., exceedingly small compared with
all ordinary dimensions. If such a particle passed through a
pinhole, the corresponding wave would be diffracted, but the
angle of spreading would be extremely small. With other
magnitudes for the mass, however, the diffraction effect can
become important. Thus an electron, of mass $9 \times 10^{-28}$ gm.,
moving, for example, with a velocity of $10^8$ cm. per second, has a
wave length of $\frac{6.54 \times 10^{-27}}{9 \times 10^{-28} \times 10^{8}} = 7.3 \times 10^{-8}$ cm., a quantity
of atomic dimensions. Thus if the electron passed through an
aperture of atomic size, as a hole between atoms, it could be 
diffracted through a large angle. It is then evident that diffrac- 
tion of electrons on an atomic scale is important; in fact, we 
shall see in the next chapter that this is just why the atomic scale 
is what it is. 
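
The two wave lengths just quoted follow directly from Eq. (4). A minimal sketch of the arithmetic, using the values of h and of the electron mass given in the text (the function name is hypothetical):

```python
H = 6.54e-27  # Planck's constant in erg sec, as quoted in the text

def de_broglie_wavelength(mass_gm, speed_cm_per_sec):
    """Wave length in cm from p = h / lambda with p = m v (nonrelativistic)."""
    return H / (mass_gm * speed_cm_per_sec)

gram_particle = de_broglie_wavelength(1.0, 1.0)
electron = de_broglie_wavelength(9.0e-28, 1.0e8)

print(f"1 gm. at 1 cm. per second:        {gram_particle:.2e} cm")
print(f"electron at 10^8 cm. per second:  {electron:.2e} cm")
```

The nineteen powers of ten separating the two results are the whole reason that diffraction is negligible for familiar objects and essential for electrons in atoms.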

206. Wave Packets and the Uncertainty Principle. — Just as 
with light, we assume a statistical relation between the intensity 
of the wave and the probability of finding the particle at the 
corresponding point. A uniform infinite monochromatic plane 
wave corresponds to a particle traveling with a definite energy 
and momentum in a definite direction, but whose position is 
entirely unknown. Such a mechanical system would be approxi- 
mated by electrons which had been all accelerated to the same 
speed in a vacuum tube, but whose individual positions we did not 


know. If we wished to fix the positions, we could let the beam 
of electrons fall on a screen containing a pinhole. Then any 
electron found on the far side would have gone through the pinhole,
so that we would know its $y$ coordinate with an uncertainty
$\Delta y$ (using the same coordinates as with the optical case, $x$ normal
to the screen, $y$ in the plane of the screen). After passing
through, the electrons would travel practically in a straight
line; but the ray will be deviated on account of diffraction, and
since the law of motion of the electron is not definitely fixed,
but is merely a probability law connecting it with the wave,
there will be an uncertainty in its $y$ momentum, given by $\Delta y \, \Delta p_y = h$.
Similarly if we try to determine the $x$ coordinate of the
electron by opening and closing a shutter, so that we know exactly
when it went through the hole, we thereby introduce a broadening
into the spectrum of the wave, hence an uncertainty in wave
length of the particle, and finally in its $x$ component of momentum,
given by $\Delta x \, \Delta p_x = h$. Thus the principle of uncertainty
operates with particles as with photons. 

The wave packet, as set up in this way, may be made extremely 
small without diffraction, if the wave length is as small as it often 
is. Thus with a particle of the mass of familiar objects, the wave 
function representing the motion of its center of gravity can be 
concentrated in a region much smaller than atomic dimensions, 
without being troubled by diffraction. This packet would then, 
in a force field, travel around in a certain way without appreciable 
spreading. We know at each instant that the particle is within 
the packet. Thus for all practical purposes the law of motion 
of the packet is the same as the law of motion of the particle. 
This then is the direction in which we look for the derivation 
of Newtonian mechanics from wave mechanics. We at once 
see that the motion of a wave packet in mechanics will be more 
complicated than in optics, for the wave length in mechanics, 
$\lambda = h/p$, changes continuously from place to place. If we
have a conservative motion, for which alone it is easy to formulate
wave mechanics, we have $p^2/2m + V = E$,
$\lambda = h/p = h/\sqrt{2m(E - V)}$, a function of position on account of $V$. $E$
stays constant, as usual, so that the frequency is constant, as
in optics. But the variable $\lambda$ corresponds to a variable index
of refraction. There are only a few optical cases where this is
true. Generally the index changes sharply from one medium
to another, and the ray of light consists of segments of straight 


lines. In refraction by the atmosphere, however, as in astron- 
omy, or in the refraction by heated air over the surface of the 
earth, as in mirages, the path of the light rays is curved instead 
of sharply bent, and this corresponds to the usual mechanical 
case, where the paths or orbits are curved. To proceed further 
with the connection between wave mechanics and Newtonian 
mechanics, we must first investigate the shape of a ray in a case 
where the index changes with position. The general principle 
governing this is called Fermat's principle. 
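
The way in which a potential acts as a variable index of refraction can be made concrete with a short sketch. The electron energy of 10 electron volts, the potential energies, and the conversion factor below are assumptions made for illustration; the index is taken relative to the region where V = 0, on the understanding that the index is inversely proportional to the wave length.

```python
import math

H = 6.54e-27          # Planck's constant, erg sec (value used in this chapter)
M_ELECTRON = 9.0e-28  # electron mass in gm., as quoted in the text

def local_wavelength(energy, potential, mass=M_ELECTRON):
    """lambda = h / sqrt(2 m (E - V)); defined only where E > V."""
    kinetic = energy - potential
    if kinetic <= 0.0:
        raise ValueError("E - V must be positive for a real wave length")
    return H / math.sqrt(2.0 * mass * kinetic)

def relative_index(energy, potential, mass=M_ELECTRON):
    """Index of refraction relative to the region where V = 0 (index proportional to 1 / lambda)."""
    return local_wavelength(energy, 0.0, mass) / local_wavelength(energy, potential, mass)

ERG_PER_EV = 1.6e-12             # approximate conversion, an assumption of this sketch
energy = 10.0 * ERG_PER_EV       # hypothetical total energy
for volts in (0.0, 2.5, 5.0, 7.5):   # hypothetical potential energies, electron volts
    v = volts * ERG_PER_EV
    print(volts, local_wavelength(energy, v), relative_index(energy, v))
```

As V rises toward E the wave length lengthens and the relative index falls toward zero, which is the mechanical counterpart of a ray bending continuously in a medium of varying index.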

207. Fermat's Principle. — Assume that we have an optical 
system, with a ray traveling from $P_1$ to $P_2$. We may start the
ray by letting parallel light fall on a pinhole, so that really the
light travels in a narrow beam, eventually reaching $P_2$. We
assume that the dimensions are so large that diffraction can
be neglected. Then suppose we compute the time taken for
light to pass from the point $P_1$ to $P_2$ along the actual ray. This
is $\int_{P_1}^{P_2} \frac{ds}{v}$, where the integral is a line integral, computed
along the ray from $P_1$ to $P_2$, $ds$ is the element of length
along the ray, and $v$ is the velocity, a function of position if
the index of refraction changes from point to point. Next,
suppose that we compute the same integral for other paths
joining $P_1$ and $P_2$, but differing in between. Since in general
the integral is not independent of path, we shall get different 
answers. In general, if we go from one path to another, the 
difference of the integral between the paths will be of the same 
order of small quantities as the displacement of the path. But 
Fermat's principle says that if one path is the correct ray, and 
the other is slightly displaced from it, the difference in the integral 
is of a higher order of small quantities. This is a sort of condition
met in the calculus of variations. In that subject,
$\delta \int_{P_1}^{P_2} \frac{ds}{v}$ is the variation
of the integral, and it means the difference between the integral
over one path, and over another infinitely near to it. Fermat's
principle says that the variation of the integral is zero for the 
actual path; meaning that the actual variation is infinitesimal 
of a higher order than the variation of path, so that it vanishes 
in the limit of small variation of path. The idea of the varia- 
tion of an integral is closely analogous to that of the differential 
of a function in ordinary calculus. Thus, if the variation of an
integral is zero, for a given path, that means that the integral
itself is a maximum or minimum with respect to variations of
path; or, more generally, that it is stationary, not changing with
small variations of path. Setting the variation equal to zero
corresponds to setting the derivative of a function equal to
zero in calculus.

Let us verify Fermat's principle in two simple cases.
First, we assume that $v$ is everywhere constant, so that there
are no mirrors or lenses. Then we can take $v$ outside the
integral, dividing through by it, and having $\delta \int ds = 0$. That
is, the true path of light between $P_1$ and $P_2$ is that line which has
minimum (or maximum) length, and joins $P_1$ and $P_2$. Obviously
the minimum is desired in this case; and the shortest line
between $P_1$ and $P_2$ is a straight line, which then is the ray. Let
us compute the variation of path, to check the variation
principle. In Fig. 59 (a), we show the straight line joining $P_1$ and
$P_2$, and also a varied path, $P_1BP_2$. The length of this second path is

$$2\sqrt{(P_1A)^2 + (AB)^2} = 2(P_1A)\left[1 + \frac{1}{2}\frac{(AB)^2}{(P_1A)^2} + \cdots\right] = (P_1P_2) + \frac{2(AB)^2}{(P_1P_2)} + \cdots,$$

differing from the direct path $P_1P_2$ by an infinitesimal
of the second order, if $(AB)$, the deviation of the path $P_1BP_2$
from $P_1AP_2$, is regarded as small of the first order. In other
words, the path $P_1AP_2$ satisfies the condition that the variation
of its length is zero (that is, small of the second order). On the
other hand, if we started with a crooked path, as $P_1AP_2$ in (b),
then the path $P_1BP_2$ differs from it approximately by the amount
$(BC) + (BD)$, or approximately $2(AB)\sin\theta$, an infinitesimal
of the same order as $(AB)$, so that in this case the variation is not
zero, and the crooked path is not the correct one.

Fig. 59. — Variation of length of path. (a) The straight line $P_1AP_2$ differs in length from the varied path $P_1BP_2$ by a small quantity of the order of the square of $AB$. (b) The broken line $P_1AP_2$ differs from $P_1BP_2$ by a quantity of the order of $AB$ itself. Hence the straight line of (a), rather than the broken one of (b), is the one for which the variation of length is zero.
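
The difference between a second-order and a first-order change of length is easy to exhibit numerically. In the sketch below the coordinates of the end points, the height of the corner of the crooked path, and the function names are all assumptions chosen for illustration; the displacement AB is made along the perpendicular bisector of the line joining the end points.

```python
import math

def path_length(p1, mid, p2):
    """Length of the two-segment path p1 -> mid -> p2."""
    return math.dist(p1, mid) + math.dist(mid, p2)

P1, P2 = (-1.0, 0.0), (1.0, 0.0)   # hypothetical end points

def excess_over_straight(ab):
    """Extra length of P1-B-P2 over the straight line, B displaced by ab from the midpoint A."""
    return path_length(P1, (0.0, ab), P2) - math.dist(P1, P2)

def excess_over_crooked(ab, height=0.5):
    """Change in length when the corner A = (0, height) of a crooked path is moved by ab."""
    return path_length(P1, (0.0, height + ab), P2) - path_length(P1, (0.0, height), P2)

for ab in (1e-1, 1e-2, 1e-3):
    print(ab, excess_over_straight(ab), excess_over_crooked(ab))
```

Each tenfold reduction of AB reduces the first excess about a hundredfold and the second only about tenfold, so that only the straight line has a vanishing variation of length.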

As a second example, we take the case of reflection. In Fig.
60, consider the path $P_1AP_2$, connecting $P_1$ and $P_2$, satisfying the
law of reflection on the mirror $OA$. This path evidently equals
$P_1'AP_2$ in length, where $P_1'$ is the image of $P_1$. Similarly a
slightly different path $P_1BP_2$ equals $P_1'BP_2$, which is therefore
longer, since $P_1'AP_2$ is the straight line connecting $P_1'$ and $P_2$.
In other words, $P_1AP_2$ makes the integral a minimum, and is the
correct path. In this case we could again easily show that the
integral along $P_1BP_2$ differed from that along $P_1AP_2$ by
quantities in the square of $AB$, verifying our statement that if
the path is displaced by small quantities of the first order
$(AB)$ the integral is changed only in the second order $(AB^2)$.
A similar proof can be carried through for the case of refraction,
showing that the law of refraction is given by Fermat's
principle.

Fig. 60. — Fermat's principle for reflection. The path $P_1AP_2$, equal to $P_1'AP_2$, differs in length from its neighbor $P_1BP_2$ by a small quantity of the order of the square of $AB$.

A fundamental proof of Fermat's principle can be given
directly from the determination of the ray from diffraction theory. 
The condition that a point $P_2$ lie in the ray, if we discuss diffraction
through the aperture by Huygens' principle as in the last chapter,
is that the various paths leading from $P_1$ to $P_2$, by going to various
points of the aperture, and then being scattered in Huygens'
wavelets from there to $P_2$, should be approximately the same, so
that the light can interfere constructively at $P_2$. This means
that such paths, as measured in wave lengths, are all approximately
the same length. In other words, for constructive interference,
$\int_{P_1}^{P_2} \frac{ds}{\lambda}$, the number of wave lengths between $P_1$ and $P_2$,
must be independent of slight variations in the path, or $\delta \int_{P_1}^{P_2} \frac{ds}{\lambda} = 0$.
This clearly is the condition whether $\lambda$ is independent of position
or not, for, even if the waves change in length from point to point,
we must still have the waves interfere to get the ray, and this
still demands the same number of wave lengths along neighboring
paths. Now $\lambda = v/\nu$, and since $\nu$, the frequency, is a constant
throughout the path of the light, we may then write the variation
as $\nu \, \delta \int_{P_1}^{P_2} \frac{ds}{v} = 0$, from which, dividing by $\nu$, we have Fermat's
principle. This interpretation in terms of the interference of
the waves along the ray is the fundamental meaning of Fermat's
principle.
208. The Motion of Particles and the Principle of Least Action. 

We shall now show that if we use the analogue to Fermat's prin- 
ciple in mechanics, it leads to the correct motion of the particle 
according to Newtonian mechanics. As we have seen, the wave 
problem representing the motion of a single particle whose variables
we know is a ray. And the path of this ray is given by
Fermat's principle, which we may write in the form $\delta \int ds/\lambda = 0$.
But now in wave mechanics, $h/\lambda = p$, the momentum, so that,
canceling out the constant factor $h$, this becomes $\delta \int p \, ds = 0$.
But this is a well-known equation of ordinary mechanics: the
integral $\int p \, ds$, or $\int p \, dq$, if $q$ is the coordinate in a one-dimensional
motion, is called the action, and the principle $\delta \int p \, dq = 0$, showing
that the action is a maximum or more often a minimum, is called
the principle of least action. And by the calculus of variations 
we can show that the principle of least action leads to Lagrange's 
equations, as the equations giving the motion of a particle which 
obeys the principle. This principle, or a closely related one called 
Hamilton's principle, also stated in terms of the calculus of varia- 
tions, is often considered a fundamental formulation of the whole 
of mechanics, more fundamental than Newton's laws of motion, 
since these, in the form of Lagrange's equations, follow from it. 
As a matter of fact, the derivation of Lagrange's equations from 
the variation principle is the simplest way of deriving them, for 
one familiar with the calculus of variations, and leads to the 
equations directly in any arbitrary coordinate system. But 
here we have gone even farther: we have sketched the derivation
of the principle of least action from wave mechanics, as the law 
giving the shape of a ray, determined from interference of the 
waves. As we see from this, wave mechanics is the fundamental 
branch of mechanics, and ordinary Newtonian mechanics, the 
mechanics of particles, is derived from it. 
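
That the Newtonian orbit makes the action stationary can be tested numerically in a simple case. The sketch below is an illustration under stated assumptions, not a derivation: it takes a particle in a uniform gravitational field with V = mgy, assumed values of the mass, launch speed, and launch angle, and compares the rate of change of the integral of p ds for the parabolic orbit with that for a straight line joining the same end points at the same energy. The integral is evaluated by the midpoint rule and the path is displaced by a small multiple of a sine curve that vanishes at both ends.

```python
import math

MASS = 1.0                      # gm. (assumed example value)
G = 980.0                       # acceleration of gravity, cm per second squared
V0 = 1000.0                     # launch speed, cm per second (assumed)
ALPHA = math.radians(30.0)      # launch angle (assumed)
ENERGY = 0.5 * MASS * V0 ** 2   # total energy E, with V = m g y
RANGE = V0 ** 2 * math.sin(2.0 * ALPHA) / G   # the Newtonian orbit returns to y = 0 here

def parabola(x):
    """Newtonian trajectory through (0, 0) and (RANGE, 0)."""
    return x * math.tan(ALPHA) - G * x ** 2 / (2.0 * (V0 * math.cos(ALPHA)) ** 2)

def parabola_slope(x):
    """dy/dx along the Newtonian trajectory."""
    return math.tan(ALPHA) - G * x / (V0 * math.cos(ALPHA)) ** 2

def action(y, dy_dx, steps=4000):
    """Midpoint-rule value of the integral of p ds, with p = sqrt(2 m (E - V))."""
    h = RANGE / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        p = math.sqrt(2.0 * MASS * (ENERGY - MASS * G * y(x)))
        total += p * math.sqrt(1.0 + dy_dx(x) ** 2) * h
    return total

def action_slope(y, dy_dx, eps=0.1):
    """Central-difference rate of change of the action under the displacement eps*sin(pi x / RANGE)."""
    k = math.pi / RANGE
    def shifted(sign):
        return action(lambda x: y(x) + sign * eps * math.sin(k * x),
                      lambda x: dy_dx(x) + sign * eps * k * math.cos(k * x))
    return (shifted(+1.0) - shifted(-1.0)) / (2.0 * eps)

slope_true = action_slope(parabola, parabola_slope)
slope_flat = action_slope(lambda x: 0.0, lambda x: 0.0)
print("Newtonian parabola:", slope_true)
print("straight line     :", slope_flat)
```

For the parabola the rate of change is negligible compared with that for the straight line, which is what the principle of least action asserts for the correct orbit.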




1. Assume in Fig. 61 that POP' is the path of the optically correct ray 
passing from one medium into a second one of different refractive index. 
Prove Fermat's principle for this case, showing that the time for the ray to 
pass along a slightly different path, as PAP', differs from that along POP' by 
a small quantity of higher order than the distance AO. The figure is drawn 
so that AB, CO, are arcs of circles with centers at P and P', respectively, 
and it is to be noted that for small AO, the figures AOB, AOC, are almost 
exactly right triangles. 

Fig. 61. — Fermat's principle for refraction. 

2. An electron of charge $e = 4.774 \times 10^{-10}$ electrostatic units falls
through a difference of potential of $V$ volts (1 volt = 1/300 e.s.u.) and
bombards a target, converting all its energy into radiation, which travels
out as one photon. Using the relations that the energy of the photon =
$h\nu$, $\nu = c/\lambda$, where $c$, the velocity of light, is $3 \times 10^{10}$ cm. per second, find
the wave length of the resulting radiation. Find the number of volts necessary
to produce visible light of wave length 5,000 A. (1 A. is $10^{-8}$ cm.);
x-rays of wave length 1 A.; gamma rays of wave length 0.001 A.

3. Assume that light falls on a metal and ejects photoelectrons, the energy 
required to pull an electron through the surface being at least 2 volts. Find 
the photoelectric threshold frequency, the longest wave length which can 
eject electrons, remembering that the long wave lengths have small photons 
which have not enough energy. Discuss the effect of work function (the 
energy required to pull the electron out) on photoelectric threshold. 

4. Newtonian mechanics becomes inaccurate when the wave length of the 
particle becomes of the same order of magnitude as the dimensions involved. 
Consider the accuracy of Newtonian mechanics in the problem of an electron 


in an atom. Assume for purposes of calculation that the electron moves 
in a circular orbit of radius 0.5 A., with an angular momentum $h/2\pi$ (determine
its speed, and hence wave length, from this fact).

5. Consider as in Prob. 4 the accuracy of Newtonian mechanics for a 
hydrogen atom in a hydrogen molecule. The hydrogen atom weighs about 
1,800 times as much as an electron. Assume the speed of the atom to be 
such that its energy is the mean kinetic energy of a one-dimensional oscillator
in temperature equilibrium at temperature 300° abs., or $\frac{1}{2}kT$, where
$k = 1.31 \times 10^{-16}$, $T$ is the absolute temperature. Compare the wave length
with the amplitude of oscillation of the atom. To find this, assume that it
oscillates with simple harmonic motion, and that its frequency of oscillation
is 3,000 cm$^{-1}$. (The unit of frequency, cm$^{-1}$, is the frequency associated
with a wave length of 1 cm.) Knowing the frequency, mass, and total energy, it
is then possible to find the amplitude.

6. Consider, as in Prob. 5, the same hydrogen molecule at 10° abs.; an 
atom of atomic weight 100, in a diatomic molecule of two like atoms, similar 
to the hydrogen molecule, with the same restoring force acting between the 
atoms (therefore with a much slower speed of vibration, on account of the 
larger mass), at 300° abs.; at 10° abs. 

7. Consider whether the uncertainty principle is important in phenomena 
of astronomical magnitude. Assume a body of the mass of the earth (found 
from its radius of 4,000 miles, mean density 5.5), moving with a speed of 
20 km. per second. Now a measurement of the position is considered, in 
wave mechanics, to introduce an uncertainty in the velocity, determined in 
terms of the uncertainty in the measurement of position by the relation 
$\Delta p \, \Delta q = h$. Suppose that the position of the body was determined in space
with an error of only 1 m. (a much greater accuracy, of course, than could
be really obtained). Find the corresponding uncertainty in momentum,
and the angle $\theta$ through which the path is deviated by the measurement.
Find how far from its original path the deviation would carry the body in a 

8. Conjugate foci in optics are points connected by an infinite number of 
possible correct paths. Thus by Fermat's principle the optical path, or 
length of time taken to traverse the ray, is stationary for each of these paths, 
meaning that the optical path is the same for each. Discuss this, showing 
that for the conjugate foci of a simple lens the optical path is the same for 
each ray, carrying out the actual calculation of time. 

9. Using the properties of conjugate foci mentioned in Prob. 8, prove 
that if a hollow ellipsoid of revolution is silvered, to form a mirror, the 
foci of the ellipsoid are optical conjugate foci. Prove that a paraboloidal 
mirror forms a perfect image of a parallel plane wave coming along its axis. 


The mathematical treatment of wave mechanics starts with a 
wave equation, similar to those of mechanical vibrations or of 
light. We shall not try to derive this equation from more 
fundamental principles, as we derive the equation of mechanical 
vibration from Newton's equations, or the wave equation of 
optics from Maxwell's equations; there are some ways of stating 
wave mechanics apparently somewhat more fundamental than 
the wave equation, but they are not the best methods to start
one's study with. We shall thus commence by postulating the 
wave equation, though arriving at its form by analogy with 
other cases. In this chapter we take only the form not involving 
the time, since this has a close analogy to optics. The form 
including the time is more remarkable, in that it involves com- 
plex quantities explicitly in its statement. We shall later 
treat it, separate variables in it, and show that the part independent
of time is the equation treated in this chapter. This
equation was first given by Schrödinger, and is called Schrödinger's
equation.

As we recall, the index of refraction, and wave length, of the 
waves vary from point to point. This means that the differential 
equation is very much like that of the nonuniform string, which 
we discussed in Chap. XIV. We shall be able to use the same 
approximate solution developed for that problem. We shall 
also get the condition for stationary waves, corresponding to 
the string held at both ends. This is the so-called quantum 
condition, and it now determines, not the overtones of a vibrating 
string, but the energy levels and stationary states of atoms and 
other systems. The problem, as in the string, leads to expansion 
in orthogonal functions, and we shall consider this theory in 
later chapters. 

209. Schrödinger's Equation. — The wave equation of optics,
after the time is eliminated, can be written $\nabla^2 u + (4\pi^2/\lambda^2)u = 0$,
where u is the displacement. In the mechanical problem,
$h/\lambda = p$ = momentum. We assume a potential function V
(wave mechanics is very difficult to formulate when there is no
potential). Then the total kinetic energy is $p^2/2m$, so that
$p^2/2m + V = E$, the total energy, and $p = \sqrt{2m(E - V)}$.
Thus we have the equation

$$\nabla^2 u + \frac{8\pi^2 m}{h^2}(E - V)u = 0,$$

or

$$-\frac{h^2}{8\pi^2 m}\nabla^2 u + Vu = Eu. \qquad (1)$$

These are two forms of Schrödinger's equation in the form not
involving the time.
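As a numerical check on the step from the optical wave equation to Eq. (1), the following minimal sketch computes the wave length $\lambda = h/p$ and confirms that $4\pi^2/\lambda^2$ equals $8\pi^2 m(E - V)/h^2$. The electron, the energies of 10 and 4 electron volts, and the function names are illustrative assumptions, not taken from the text.

```python
import math

H = 6.62607015e-34       # Planck's constant h, joule seconds
M_E = 9.1093837015e-31   # electron mass, kilograms (illustrative particle)
EV = 1.602176634e-19     # joules per electron volt

def momentum(m, E, V):
    """p = sqrt(2m(E - V)); valid only where the kinetic energy E - V is positive."""
    return math.sqrt(2.0 * m * (E - V))

def optical_coefficient(m, E, V):
    """The factor 4 pi^2 / lambda^2 of the optical equation, with lambda = h / p."""
    lam = H / momentum(m, E, V)
    return 4.0 * math.pi ** 2 / lam ** 2

def schrodinger_coefficient(m, E, V):
    """The factor 8 pi^2 m (E - V) / h^2 multiplying u in the first form of Eq. (1)."""
    return 8.0 * math.pi ** 2 * m * (E - V) / H ** 2

# Hypothetical example: total energy 10 eV in a region where V = 4 eV.
E, V = 10.0 * EV, 4.0 * EV
wavelength = H / momentum(M_E, E, V)
print(f"wave length = {wavelength:.3e} m")
print(optical_coefficient(M_E, E, V), schrodinger_coefficient(M_E, E, V))
```

The two printed coefficients agree to rounding error, which is all that the substitution $h/\lambda = p$ asserts.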

Suppose that a solution of this equation is u(x, y, z). Then the 
corresponding solution of the problem involving the time is this 
times an exponential function of the time. Since the frequency 
$\nu$ is $E/h$, this is $e^{2\pi i E t/h}u(x, y, z)$. We note that the differential
equation for u, and hence the resulting solution, depend on the
energy E, just as the function describing the shape of a vibrating
string depends on the frequency. Hence we should properly
use a subscript, $u_E(x, y, z)$. The general solution would now
be a sum of such solutions for all different values of E,

$$\sum_E A_E\, e^{2\pi i E t/h}\, u_E(x, y, z), \qquad (2)$$


as we had a sum of solutions as the general solution for the vibrat- 
ing string. 

210. One-dimensional Motion in Wave Mechanics. — For
one-dimensional motion, where u is a function of x alone, Schrödinger's
equation becomes

$$\frac{d^2 u}{dx^2} + \frac{8\pi^2 m}{h^2}(E - V)u = 0. \qquad (3)$$

Since in general V is a function of x, this is an equation very much 
like that of the string with variable density but constant tension. 
Just as with that problem, we can easily set up an approximate 
solution of the problem, if the quantity E — V, corresponding 
to the density, does not change by too large a fraction of itself 
in a wave length, though the exact solution is generally difficult,
and has been worked out in only a few special cases. The
approximate solution is easily shown, by the method used in
Chap. XIV, to be

$$u = \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\pm\frac{2\pi i}{h}\int p\,dx}, \qquad (4)$$

where p has the value $\sqrt{2m(E - V)}$, as before. This method
of solution, as applied to wave mechanics, is often known as 
the Wentzel-Kramers-Brillouin method. It immediately leads 
to one result of physical interest, when we consider the amplitude 
of the wave. 
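To see how good the approximation is, one may substitute it back into the differential equation. The sketch below does this numerically for the real form of the approximate solution, writing Eq. (3) as $u'' + k^2 u = 0$ with $k = 2\pi p/h$ and assuming, purely for illustration, that $k^2 = a + bx$ varies linearly with x. The constants a and b, the range of x, and the function names are hypothetical choices.

```python
import math

def wkb_wave(x, a, b):
    """Real approximate wave for k(x)^2 = a + b*x, where k = 2*pi*p/h."""
    k2 = a + b * x
    # Integral of k dx from 0 to x, done in closed form for the linear case.
    phase = (2.0 / (3.0 * b)) * (k2 ** 1.5 - a ** 1.5)
    # Amplitude k^(-1/2), the counterpart of 1/(E - V)^(1/4).
    return k2 ** -0.25 * math.cos(phase)

def max_relative_residual(a, b, x_max=10.0, samples=200, d=3e-4):
    """Largest |u'' + k^2 u|, measured against k^2 times the local amplitude."""
    worst = 0.0
    for i in range(samples + 1):
        x = x_max * i / samples
        k2 = a + b * x
        # Central second difference as an estimate of u''.
        second = (wkb_wave(x + d, a, b) - 2.0 * wkb_wave(x, a, b)
                  + wkb_wave(x - d, a, b)) / d ** 2
        residual = second + k2 * wkb_wave(x, a, b)
        worst = max(worst, abs(residual) / (k2 * k2 ** -0.25))
    return worst

slow = max_relative_residual(a=100.0, b=1.0)  # k^2 changes little in a wave length
fast = max_relative_residual(a=1.0, b=1.0)    # k^2 changes a great deal in a wave length
print(f"slowly varying: {slow:.2e}   rapidly varying: {fast:.2e}")
```

The residual is very small when $E - V$ changes by only a small fraction of itself in a wave length, and becomes a sizable fraction of the terms themselves when that condition fails.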

We have seen in the last chapter that the intensity of the 
wave measured the probability of finding the particle at the corre- 
sponding point, just as in optics the intensity of the light-wave 
measures the probability of finding the photon. Now, if we use 
the wave function given above, with its complex exponential, 
we must evidently multiply by its conjugate to get the intensity, 
or the square of its amplitude: 

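$$u\bar{u} = \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\frac{2\pi i}{h}\int p\,dx} \cdot \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{-\frac{2\pi i}{h}\int p\,dx} = \frac{\text{constant}}{\sqrt{E - V}} = \frac{\text{constant}}{p}. \qquad (5)$$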

To get the probability that the particle is in a small element of 
length ds, we must multiply through by ds, obtaining a con- 
stant X ds/p. But now suppose a particle were moving along 
the x axis according to the Newtonian mechanics, with the 
same energy E, in the same potential field V. The length of
time which it would spend in any small element of length ds 
would be ds/v, or m ds/p. Apart from the arbitrary constant, 
which could be determined to bring agreement, this is just 
like the quantum expression. If we knew that the classical 
particle was moving in this way, but did not know when it 
started, all we could say would be that the probability of find- 
ing the particle in a given region at any time was proportional 
to the length of time which it would have to spend in that region. 
In other words, our solution, of constant energy, corresponds 
to a classical particle whose energy is determined but whose 
initial time of starting is undetermined, and we can find from 
our wave function the probability of finding it in any region. 
To the approximation to which the Wentzel-Kramers-Brillouin 
solution is correct, the classical and quantum probabilities agree 
exactly, but they do not to a higher approximation. At any rate, 
however, we can say that the wave function is large in regions 


where the particle is likely to be, or is moving slowly, and is small 
where the particle is moving rapidly and is unlikely to be. It 
should be stated that sometimes, instead of the wave function 
with complex exponential, we use the corresponding real wave

$$\frac{\text{constant}}{\sqrt[4]{E - V}} \cos\frac{2\pi}{h}\int p\,dx. \qquad (6)$$

In this case, the probability function has a factor of $\cos^2\frac{2\pi}{h}\int p\,dx$,
introducing a sinusoidal fluctuation of probability which must be
ignored in making comparisons with the classical probability.
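The statement that the probability is proportional to ds/p, and so to the time a classical particle spends in ds, can be tried numerically. The sketch below assumes a linear oscillator of amplitude A, for which p is proportional to $\sqrt{A^2 - x^2}$, and compares the fraction of probability assigned to a region by the weight 1/p with the fraction of a half period the classical orbit $x = A\sin\omega t$ spends there. The function names and the choice of regions are illustrative.

```python
import math

def fraction_from_inverse_p(a, b, amplitude, steps=100000):
    """Share of the probability in [a, b] when each dx is weighted by 1/p."""
    def integral(lo, hi):
        width = (hi - lo) / steps
        total = 0.0
        for i in range(steps):
            x = lo + (i + 0.5) * width   # midpoints never touch the turning points
            total += width / math.sqrt(amplitude ** 2 - x ** 2)
        return total
    return integral(a, b) / integral(-amplitude, amplitude)

def fraction_from_orbit(a, b, amplitude):
    """Share of a half period that x = A sin(wt) spends between a and b."""
    return (math.asin(b / amplitude) - math.asin(a / amplitude)) / math.pi

inner_quantum = fraction_from_inverse_p(0.0, 0.5, 1.0)
inner_classical = fraction_from_orbit(0.0, 0.5, 1.0)
outer_classical = fraction_from_orbit(0.5, 1.0, 1.0)
print(inner_quantum, inner_classical, outer_classical)
```

The two fractions for the inner region agree to within the error of the numerical integration, and the outer region, where the particle moves slowly, carries twice the probability of the inner one.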

In the preceding paragraph, we have tacitly assumed that 
the kinetic energy E — V was always positive, so that p was 
real. But in many problems, as we have seen from our discus- 
sion of classical mechanics, this is true only in limited regions, 
and outside these regions p becomes imaginary. Even in this 
case, the method of Wentzel, Kramers, and Brillouin is still 
formally correct. But there are two physical differences. First,
$\pm\frac{2\pi i}{h}\int p\,dx$ is now real, so that we have a real exponential,
either increasing or decreasing with x, depending on the sign.
Secondly, to keep the whole function real, we must make the
first factor constant/$\sqrt[4]{V - E}$, which amounts to changing
the constant by multiplying by $\sqrt[4]{-1}$. The approximate
solution does not hold at all in the neighborhood of the point 
where the kinetic energy is zero, for there the wave length is 
infinite, and the assumption that E — V changes only a little 
in the distance of a wave length cannot be true. But we can 
easily see how to construct an approximate solution in this 
region, for the differential equation here is simply $d^2u/dx^2 = 0$,
the equation of a straight line; the actual curve of u against x, 
as we readily see, has a point of inflection at the point where 
E = V, being concave downward where the kinetic energy is 
positive, concave upward where it is negative. We can then 
take the exponential solution in the region of negative kinetic 
energy, and the oscillatory one in the region of positive kinetic 
energy, and join them by a line which is approximately straight. 
It is obvious, as we see for instance in Fig. 62, that, if we know 
beforehand the constants of the exponential solution (as for 
instance the amplitudes of the two terms, one increasing and 
the other decreasing exponentially, which we must add to get the 


complete solution) the initial value and slope of the sinusoidal
solution must be definitely determined to make the two join
smoothly. That is, the phase of the sinusoidal solution, or the
amplitudes of sine and cosine functions which we add together,
are determined. The same thing is true at every such boundary
that we cross; if we once determine the two arbitrary constants
in one part of the region, the whole function is determined, to
make exponential and sinusoidal curves join smoothly. This
must naturally be true, since the differential equation is one of
the second order, with just two arbitrary constants.

Fig. 62. — Joining of exponential and sinusoidal functions at point where
p = 0. Upper curve shows potential and total energy against x, lower curve
shows wave function. The exponential part of the function is so chosen that
the amplitude of the term increasing exponentially with decreasing x is zero;
otherwise the function would go to infinity instead of asymptotically to zero as
x became negatively infinite.


211. Boundary Conditions in One-dimensional Motion. —
Suppose, first, that we consider a mechanical problem where the
kinetic energy is always positive. Then there are no regions
where the wave function is exponential; it is always sinusoidal,
of finite amplitude. For any energy we have two solutions,
which, bringing in the time but writing in exponential form, are

$$\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\frac{2\pi i}{h}\left(Et \pm \int p\,dx\right)}, \qquad (7)$$

of which the real parts represent progressive waves traveling to 
left or right along the x axis. This corresponds to the fact that 
the corresponding mechanical particles can travel in either direc- 
tion, and, as we have seen, the intensity of the wave at any point 
properly agrees with the probability that the particle should 
be in that region, as computed classically on the assumption 
that we do not know when the particle started. 

Next let us assume that $E - V$ remains positive to infinity in
one direction, say to the right, but becomes negative to the left
of a certain point, say $x = x_1$, as in Fig. 62. The solution will
then be exponential to the left of $x = x_1$. But in general it will
be a linear combination of two exponential functions, one increasing
exponentially in magnitude to $\infty$ as x approaches $-\infty$, the
other decreasing exponentially to zero. If the amplitude of the
former is different from zero, then the intensity of the wave will
be infinite at $-\infty$, meaning that the probability of finding the
particle at $-\infty$ is infinitely greater than of finding it anywhere
else. This is ordinarily not the physical situation we wish to
describe; hence we must assume that the amplitude is zero, and
that the solution to the left of $x_1$ has just the one term

$$\frac{A}{\sqrt[4]{V - E}}\, e^{\frac{2\pi}{h}\int_{x_1}^{x}\sqrt{2m(V - E)}\,dx}, \qquad (8)$$

which goes to zero at $x = -\infty$. But now this comes up to the
point $x = x_1$ with a definitely determined slope (or rather, the
ratio of slope to function, in which the arbitrary constant factor
cancels out, is definitely determined). Then there is just a very
definite sinusoidal function which joins onto this, as Fig. 62
suggests: the approximate solution for $x > x_1$ is given by

$$\frac{\text{constant}}{\sqrt[4]{E - V}} \sin\left(\frac{2\pi}{h}\int_{x_1}^{x} p\,dx + \alpha\right). \qquad (9)$$


It can be shown that for continuity of Eqs. (8) and (9) at $x_1$ we
must have $\alpha = \pi/4$ = 45 deg. This statement means that the
sine curve, instead of having a node at $x_1$, has already at that
point passed through an eighth of a wave length. It is as if this
eighth wave length were stretched out to infinity to form the 
exponential part of the curve. 

We have seen, then, that a boundary where $E = V$ imposes a
definite boundary condition on the solution. In our problem 
where the motion extends to infinity in one direction, the condi- 
tion can be always satisfied, by proper choice of phase and ampli- 
tude of the sinusoidal function, as we have seen. But there are 
two interesting results of our calculation. First, the wave in
the region where kinetic energy is positive becomes now a real 
function of position, or correspondingly a real function of time. 
In other words, it is a standing wave, not a progressive wave. It 
corresponds to superposed progressive waves traveling with equal 
intensity in both directions. The progressive wave approaching 
from the right is reflected at the boundary, and turns back with- 
out diminution of intensity on the reflection. The mechanical 
situation is that the particle, approaching the point where kinetic 
energy is negative, is reflected and turned back, just as it would 
be in the same problem in classical mechanics. But the other 
interesting result is that, on account of the exponential terms to 
the left of x = 0, the particles can slightly penetrate the region 
where kinetic energy is negative. On account of the rapid dying 
out of the exponential, this effect is not large, but we shall see in 
the next section that there can be cases where it is very important 
physically. This penetration by an exponential wave has an 
analogy in optics: a wave of light approaching an optically rarer 
medium at an angle greater than the critical angle is totally 
reflected, but at the same time, as we have seen in Sec. 168, 
Chap. XXIII, there is a disturbance, dying out exponentially, in 
the rarer medium, almost exactly equivalent to what we have here.

212. The Penetration of Barriers. — The exponential penetra- 
tion of particles into the region of negative kinetic energy has as 
a result that in wave mechanics, unlike classical mechanics, a 
particle can go from one region of positive kinetic energy to 
another, even though there is a barrier of negative kinetic energy 
between. Such barriers are found, for example, in some cases 
at the surface of a metal, where the electrons in emerging from 


the metal, for example at high temperature in thermionic emis- 
sion, may find a surface layer of atoms, exerting on them such 
a strong repulsive force that they would be unable to penetrate 
on classical mechanics, but can in quantum theory. Suppose 
that we have a simple barrier of the sort shown in Fig. 63, where
the potential has one constant value to the left of $x_0$, a second
high value between $x_0$ and $x_1$, and a third lower value to the right
of $x_1$, and where $E - V$ is negative only between $x_0$ and $x_1$. The
corresponding problem in metals is that where the region to the
left of $x_0$ represents the interior of the metal, that to the right of
$x_1$ the space outside, that between $x_0$ and $x_1$ the surface layer or
barrier. An electron of low energy, as $E_1$, will be confined, except
for a small exponential term, to the region to the left of $x_0$, or
the interior of the metal, and will never escape. An electron of
the very high energy $E_3$ will be able to escape, either on classical
or quantum mechanics. But an electron of intermediate energy
$E_2$ can penetrate the barrier and escape on quantum mechanics,
but not in Newtonian mechanics. These electrons of high
energy, as $E_2$ or $E_3$, are met only at high temperatures, so that
we see the connection with thermionic emission.

Fig. 63. — Potential barrier. The barrier is between $x_0$ and $x_1$. Motion with
the energy $E_1$ would have a wave function large only to the left of $x_0$, rapidly
decreasing to the right of $x_0$. With the energy $E_2$, the wave function would be
large on both sides of the barrier, small but not zero within it, giving the
possibility of penetrating the barrier. The wave function of energy $E_3$ would be
large everywhere.

Consider an electron of energy $E_2$, and a solution which to the
right of $x_1$ is a progressive wave traveling to the right. Then
within the barrier we should have a combination of the two kinds
of exponential functions, one increasing exponentially to the left,
the other decreasing, with amplitudes properly chosen to satisfy
the boundary condition of continuity of the function and its
slope at $x_1$. These in turn will join onto two progressive waves
to the left of $x_0$, one traveling to the right, one to the left. The
final result may be described as follows: An incident progressive
wave falling on $x_0$ from the left; a reflected wave in the region to
the left; a transmitted wave to the right of $x_1$, the transmission
through the barrier being of the real exponential form. We can
tell something without much trouble about the amount trans- 
mitted. For within the barrier the term 

$$\frac{\text{constant}}{\sqrt[4]{V - E}}\, e^{-\frac{2\pi}{h}\int \sqrt{2m(V - E)}\,dx},$$

increasing exponentially to the left, is the important one. And
we readily see that its amplitudes at $x_1$ and $x_0$ measure, at least
in order of magnitude, the relative amplitudes of transmitted and
incident waves. Thus the fraction transmitted depends on the
square of the quantity $e^{-\frac{2\pi}{h}\int_{x_0}^{x_1}\sqrt{2m(V - E)}\,dx}$. We work out examples

of this integral in the problems, showing that there can be barriers 
of atomic size small enough so that appreciable penetration takes 
place, though in general this is not true, since a small increase in 
the height or breadth of a barrier can, on account of the exponen- 
tial, make an enormous difference in the ease of penetration. 
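The sensitivity to height and breadth can be made concrete with a rough calculation. The sketch below evaluates the square of the exponential quantity above, $e^{-\frac{4\pi}{h}\int\sqrt{2m(V - E)}\,dx}$, by a simple midpoint sum. It is an order-of-magnitude estimate only, as in the text. The rectangular barrier standing 1 electron volt above the electron's energy, the widths of 1 and 10 angstroms, and the function names are assumptions chosen for illustration.

```python
import math

H = 6.62607015e-34       # Planck's constant h, joule seconds
M_E = 9.1093837015e-31   # electron mass, kilograms
EV = 1.602176634e-19     # joules per electron volt

def transmission_estimate(V, E, x0, x1, m=M_E, steps=2000):
    """exp(-(4 pi / h) * integral of sqrt(2m(V - E)) dx from x0 to x1).

    V is a function of x; it must not fall below E anywhere in [x0, x1].
    """
    width = (x1 - x0) / steps
    integral = 0.0
    for i in range(steps):
        x = x0 + (i + 0.5) * width                       # midpoint of each strip
        integral += math.sqrt(2.0 * m * (V(x) - E)) * width
    return math.exp(-4.0 * math.pi * integral / H)

def rectangular(height):
    """Hypothetical barrier of constant height."""
    return lambda x: height

T_thin = transmission_estimate(rectangular(5.0 * EV), 4.0 * EV, 0.0, 1.0e-10)
T_thick = transmission_estimate(rectangular(5.0 * EV), 4.0 * EV, 0.0, 1.0e-9)
print(f"1 angstrom: {T_thin:.3f}   10 angstroms: {T_thick:.3e}")
```

With these numbers a barrier one angstrom thick passes an appreciable fraction of the electrons, while a tenfold increase in breadth reduces the fraction by roughly four powers of ten.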

213. Motion in a Finite Region, and the Quantum Condition. — 
Assume next that the kinetic energy is positive only in a finite 
region, so that classically the motion would be limited to that 
region. Then there will be a boundary condition on the wave 
function at each boundary of the region. Just as with the string 
held at both ends, this condition cannot in general be satisfied; 
it can be satisfied only for certain energies (corresponding to 
certain frequencies with the string). Using the approximate 
method of Wentzel, Kramers, and Brillouin, it is easy to see the 
nature of this condition. For each boundary must have essen- 
tially the treatment of Fig. 62, only the exponential decreasing 
toward infinity being allowed, whereas with an arbitrary energy 
the exponential would increase toward infinity in at least one 
direction. We have seen that the exponential part of the curve
corresponds to $\tfrac{1}{8}$ wave length of the sinusoidal part. The number
of wave lengths between $x_1$ and $x_2$ is $\int_{x_1}^{x_2}\frac{p}{h}\,dx$. Thus the whole
number of waves between $-\infty$ and $\infty$, taking account of the two
exponential ends, is $\int_{x_1}^{x_2}\frac{p}{h}\,dx + \frac{1}{4}$. Since the function goes to zero
at both limits, this must be a whole number of half waves, or
twice it must be a whole number. Hence

$$2\int_{x_1}^{x_2}\frac{p}{h}\,dx + \frac{1}{2} = 1, 2, 3, \ldots,$$

$$2\int_{x_1}^{x_2} p\,dx = \left(n + \tfrac{1}{2}\right)h, \qquad n = 0, 1, 2, \ldots \qquad (10)$$


This is the so-called quantum condition, developed particularly 
by Sommerfeld. We must remember that, since it is based on 
the approximation of Wentzel, Kramers, and Brillouin, it is not 
necessarily an exact condition. In some cases, as the linear 
oscillator, taken up in Prob. 5, it proves to be exactly true. In 
other cases, as a particle moving freely between two reflecting 
walls, as considered in Prob. 11, a similar condition holds, except
that the quantum number, which here is $(n + \tfrac{1}{2})$, a half integer,
is instead a whole integer. There are still other problems, as 
the hydrogen atom, in which a modified form of the condition 
is correct. In most cases, however, the quantum condition gives 
only an approximation, though a good one. 
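The quantum condition can be applied numerically when the integral is not easily done by hand. As a minimal sketch, the code below takes the linear oscillator with $V = 2\pi^2\nu^2 m x^2$, evaluates twice the integral of p dx between the turning points by a midpoint sum, and finds by bisection the energies for which it equals $(n + \tfrac{1}{2})h$. The units with h = m = $\nu$ = 1, the search limits, and the function names are assumptions made for convenience.

```python
import math

def phase_integral(E, m=1.0, nu=1.0, steps=2000):
    """Twice the integral of p dx between the turning points of the oscillator."""
    turning = math.sqrt(E / (2.0 * math.pi ** 2 * m * nu ** 2))
    width = 2.0 * turning / steps
    total = 0.0
    for i in range(steps):
        x = -turning + (i + 0.5) * width
        kinetic = E - 2.0 * math.pi ** 2 * nu ** 2 * m * x ** 2
        total += math.sqrt(2.0 * m * max(kinetic, 0.0)) * width
    return 2.0 * total

def quantized_energy(n, h=1.0, m=1.0, nu=1.0):
    """Energy at which the phase integral equals (n + 1/2) h, found by bisection."""
    target = (n + 0.5) * h
    lo, hi = 1e-9, 100.0 * h * nu      # assumed bracket; the integral grows with E
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if phase_integral(mid, m, nu) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [quantized_energy(n) for n in range(4)]
print(levels)   # to be compared with (n + 1/2) h nu
```

For the oscillator the energies found this way agree with $(n + \tfrac{1}{2})h\nu$ to within the error of the numerical integration; for other potentials the same procedure gives the approximate levels.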

A number of problems can be solved exactly when the motion
is confined to a finite region, and it is by comparison with these
exact solutions that one can check the method of Wentzel, Kramers,
and Brillouin, and the quantum conditions. Thus, in Prob.
6 we show that the wave equation for the linear oscillator can be
solved as an exponential times a power series. This power
series in general diverges for large x, indicating a function which
goes to infinity as x becomes infinite. But if we give the energy
a value for which $2\int p\,dx = (n + \tfrac{1}{2})h$, the series breaks off to form a
polynomial, and the function goes to zero at infinity. These are the only solutions
we can use, and they give just the quantum condition we found
before, though by a quite different method. Again, a rotator,
a solid of fixed moment of inertia and constant angular momentum
rotating on an axis in the absence of torques, has a wave
function $e^{2\pi i p\theta/h}$, where p, $\theta$ are angular momentum and angular
rotation. Since p is constant, the real forms of this are sin (or
cos) $(2\pi p\theta/h)$. For this to represent a single-valued function
of position, it is necessary, as with the circular membrane, to
have the function periodic with period $2\pi$ in $\theta$. Thus we must
have $2\pi p/h$ = integer = m, giving whole integral quantum
numbers in this case, and determining the angular momentum
as $mh/2\pi$.

214. Motion in Two or More Finite Regions. — In classical 
mechanics, we do not have to discuss specially the case where 
there are two separated regions where the kinetic energy is 
positive, separated and bounded by regions where it is negative; 
the motion occurs in one or the other of these regions, and that 
is the end of it. But in wave mechanics, the barrier between 
regions is not entirely impenetrable. We shall not go into the 
mathematical details of the solution, for, while they involve 
no new ideas, they are rather tedious. But the result is that 
the particles can penetrate the barrier and go from one region 
to the other, just as we have seen in a previous section in consider- 
ing a barrier between two regions each extending to infinity 
in one direction. There are some new situations, however. 
Each region by itself would have stationary states of its own, 
if the other were not there. But with the two, no one of these 
states refers to motion wholly in the one region; the particle 
can go back and forth from one to the other. However, if the 
energy level is one that is characteristic of the one region and 
not of the other, the particle spends almost all of its time in 
that region of which its energy is characteristic. Once in a while 
it leaks over to the other side, but it soon finds its way back. 
It may be, however, that a given energy level will be char- 
acteristic of both regions; this is surely true if they are identical 
regions. Then the particle will travel back and forth from one 
to the other, spending equal lengths of time in each. This is 
an important physical case. For instance, in the hydrogen 
molecule, both atoms are just alike, and an electron finds a
potential field which has two minima, one at each nucleus. It 
then can oscillate back and forth, spending half its time about 
one nucleus, half on the other. These problems are closely 
analogous to that of coupled oscillators, which we have already 
taken up. There we found that one oscillator would not move 
without setting the other into vibration, and similarly here the 
wave function cannot be large in one region without having a 
value in the other also. And here we have a special case if the 
two regions are identical, as we did before if the oscillators were 
equivalent. We shall find that the whole mathematical treatment
is closely analogous.


We can finally have motion in two regions, one finite, the other 
reaching to infinity. Then, if the particle starts in the finite 
range, it is able in time to leak across the boundary, and go off to 
infinity. The present explanation of radioactivity is based on 
this idea. An alpha particle is supposed to be held in an atomic 
nucleus by a restoring force pulling it to a position of equilibrium. 
But if it were outside, then being positively charged, it would 
be repelled from the positive nucleus, the repulsion going to 
zero at infinity. Thus we should have a potential curve as in
Fig. 64, where potential is drawn as function of r, the distance
from the center of the nucleus to the escaping alpha particle.
If now the alpha particle has energy E, and is originally within
the nucleus, it will eventually leak out, going off to infinity
with a large kinetic energy, as the ejected alpha particles are
actually found to have.

Fig. 64. — Potential curve for radioactive disintegration. A wave function
of energy approximately E, starting out as a wave packet within the valley of
the potential curve, would gradually leak out through the barriers.


1. Prove that the function $\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\pm\frac{2\pi i}{h}\int p\,dx}$, where $p = \sqrt{2m(E - V)}$,
is an approximate solution of Schrödinger's equation, becoming more and
more accurate as V changes by a smaller and smaller amount in a wave
length.
2. Note that in Bessel's equation for $J_m$, when m > 0, there is a region
near the origin where the Wentzel-Kramers-Brillouin approximate solution
is exponential rather than sinusoidal. Discuss the solution qualitatively
for x < m, where m is fairly large, showing how this solution joins onto the
sinusoidal one found in Prob. 9, Chap. XIV.


3. Note that the solution of Schrodinger's equation is sinusoidal or expo- 
nential in a region of constant potential. Discuss the one-dimensional 
problem of particles going from one region with constant potential $V_1$ to a
second region of constant potential $V_2$, when the energy is great enough so
that the kinetic energy is positive in both regions. Satisfy boundary condi- 
tions at the surface, making u and its derivative continuous, joining the two 
sinusoidal functions together at the boundary. Show that some of the 
incident particles travel across the boundary, but that some are reflected 
back, contrary to classical mechanics. Find the fraction reflected. 

4. Assuming the potential function of Fig. 63, consider particles striking
the barrier from the left with energy $E_2$. Set up the solution, satisfying
the boundary conditions at $x_0$ and $x_1$, and get an expression for the reflection
coefficient as a function of the height of the barrier. Show that the reflec- 
tion coefficient approaches unity if the barrier is infinitely high, or infinitely 

5. Show that the quantum integral $2\int p\,dx$ for an oscillator of natural frequency $\nu$,
energy E, equals $E/\nu$. Show that therefore the quantum condition leads
to the energy levels $E = (n + \tfrac{1}{2})h\nu$ for the oscillator.

6. Solve the problem of the linear oscillator of frequency $\nu$, where
$V = 2\pi^2\nu^2 m x^2$. To do this, set

$$u = e^{-\frac{2\pi^2 m\nu}{h}x^2}\,v(x),$$

and set up the differential equation for v(x). For convenience, introduce
the change of variables $y = 2\pi\sqrt{m\nu/h}\,x$. Solve in series, and show that
the resulting series breaks off only if $E = (n + \tfrac{1}{2})h\nu$, where n is an integer.

7. Using the series of Prob. 6, investigate the behavior for large x if the
series does not break off. Show that for very large x, v approaches the
series for $e^{\frac{4\pi^2 m\nu}{h}x^2}$, so that the whole function u increases exponentially with
$x^2$, and cannot be used as a wave function.

8. Compute and plot wave functions of the linear oscillator corresponding 
to n = 0, 1, 2, 3, 4. From the graphs find the region in which the solution 
is oscillatory (that is, the region between the points of inflection). Draw 
the potential curve and the values of E corresponding to these four stationary 
states, and show that the motion is oscillatory in the region where the 
kinetic energy is positive. 

9. Set up the approximate solution for the linear oscillator problem by 
the Wentzel-Kramers-Brillouin method, getting expressions for the functions 
in both the sinusoidal and the exponential ranges. Investigate to see how 
well these functions join on at the point of inflection. 

10. Compute and plot the approximation of Prob. 9 corresponding to 
n = 4, and compare with the exact solution. 

11. A particle executes one-dimensional motion in a container, having 
constant potential inside, and with the potential becoming suddenly infinite 
at the walls, so that the particle never gets out. Show that the boundary 
condition is that the wave function must be zero at the walls, as the dis- 
placement of a stretched string is zero at its ends. Find the wave functions 
of the problem, and find the energy of the particle in the nth stationary state. 




The quantum condition has a close connection with the phase 
space and the Hamiltonian method, which we have discussed in 
Chap. IX. Hamiltonian methods have, in fact, been the guiding 
principle in the development of the quantum theory. At the 
same time, the phase space is fundamental to statistical mechan- 
ics, the mathematical foundation of thermodynamics. For that 
reason, we may profitably treat these subjects together, though, 
of course, statistical mechanics can be developed entirely from 
the basis of classical theory. Nevertheless, on account of the 
essentially statistical nature of the quantum theory, it yields 
an almost more natural approach to statistical mechanics than 
is possible in Newtonian mechanics, and by developing the two 
together we can illustrate the correspondence between classical 
and quantum mechanics which must hold, since the classical 
theory is a correct limiting form of quantum theory for large- 
scale problems. 

215. The Quantum Condition in the Phase Space. — In Fig. 11,
Chap. IX, we show the phase space for a linear oscillator, with
a line of constant energy E, an ellipse of semiaxis $\sqrt{E/2\pi^2 m\nu^2}$
along the q axis, and $\sqrt{2mE}$ along the p axis. These quantities
measure the maximum coordinate and momentum, respectively,
which a particle of energy E attains during its motion. For such an
oscillator, the quantum condition (10), Chap. XXIX, equates
twice the integral of p dq between the minimum and maximum q
values to $(n + \tfrac{1}{2})h$. Just as $\int y\,dx$ measures the area under the
curve y(x), so $\int p\,dq$ measures the area under the curve p(q)
in the phase space. The integral $\int_{q_1}^{q_2} p\,dq$ is that part of the area
of the ellipse above the q axis, and to get the whole integral we
double this, obtaining also the integral below the q axis. This
may be written as an integral around the contour, from $q_1$ to $q_2$
around the upper branch of the curve, then back to $q_1$ along the
lower part of the curve, in which p and dq are both negative,
so that we contribute an equal positive term to the integral.
In other words, the quantum condition may be written

$$\oint p\,dq = \left(n + \tfrac{1}{2}\right)h, \qquad (1)$$

where $\oint$ indicates an integral around the contour. And the
physical interpretation is that the quantum integral is the area
of the ellipse. Since this is $\pi ab$, where a and b are the two
semiaxes, it is $\pi\sqrt{2mE}\,\sqrt{E/2\pi^2 m\nu^2} = E/\nu$, giving from Eq. (1)
$E = (n + \tfrac{1}{2})h\nu$, in agreement with the result of Prob. 5, Chap.
XXIX.
The results of the last paragraph are general: with any one-dimensional
motion the quantum integral $\oint p\,dq$ represents
the area of phase space enclosed by the path of the representative
point, and the quantum condition says that this area is
$(n + \tfrac{1}{2})h$, approximately. If we take successive stationary states,
connected with successive quantum numbers n, each will have 
a curve in phase space, the path of a representative point of the 
corresponding energy, and the area between successive curves 
will, by the quantum condition, be h. Thus the phase space is 
divided up by these paths into a set of cells, each of area h, 
one for each stationary state. 
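The division of the phase space into cells of area h can be verified directly for the oscillator. The sketch below computes the area $\pi ab$ of the ellipse of energy $(n + \tfrac{1}{2})h\nu$ from the two semiaxes given above, and then the area between successive ellipses. The mass and frequency are hypothetical values of roughly molecular size, chosen only for illustration.

```python
import math

def ellipse_area(E, m, nu):
    """Area pi*a*b of the curve of constant energy E in the phase space."""
    a = math.sqrt(E / (2.0 * math.pi ** 2 * m * nu ** 2))  # semiaxis along q
    b = math.sqrt(2.0 * m * E)                              # semiaxis along p
    return math.pi * a * b

h = 6.62607015e-34   # Planck's constant, joule seconds
m = 1.0e-26          # assumed mass, kilograms
nu = 1.0e13          # assumed frequency, cycles per second

areas = [ellipse_area((n + 0.5) * h * nu, m, nu) for n in range(5)]
cells = [areas[n + 1] - areas[n] for n in range(4)]
print([c / h for c in cells])   # each ratio should be very nearly 1
```

Each ellipse has area $E/\nu$, independent of the mass, so that successive stationary states are separated by a ring of area h.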

216. Angle Variables and the Correspondence Principle. — We
have seen in Chap. IX, Sec. 59, that a change of variables, called
a contact transformation, can be set up, in which the new coordinate
w increases uniformly with time, and the momentum J
stays constant. To visualize this transformation in the case
of the oscillator, we may imagine the phase space plotted with
such scales of coordinates and momenta that the ellipses of
constant energy become concentric circles. Then the new
variables are essentially polar coordinates in phase space, the
coordinate being the angle divided by $2\pi$, the momentum being
proportional to the square of the radius, so that obviously the
angle variable increases uniformly with time, the momentum
staying constant. The momentum J, called the action variable,
proves in fact to be precisely the area of the ellipse, or circle, or
the same integral $\oint p\,dq$ which we meet in the quantum condition.
In terms of the action variable, often called the phase integral,
we saw that Hamilton's equation

$$\frac{dH}{dJ} = \frac{dw}{dt} = \nu \qquad (2)$$


gave the frequency in terms of a simple calculation. This 
formula permits us to make an extremely interesting connection 
between the classical frequency of motion of a system and the 
frequency of the light emitted in a transition between two states 
of energy $E_2$ and $E_1$ according to Bohr's frequency condition

$$E_2 - E_1 = h\nu, \qquad (3)$$

described in Sec. 201. On the quantum theory, most energies H 
of the system are not allowed; we may have rather only those 
satisfying the quantum condition (1). Thus H cannot be 
regarded as a continuous function of J. We may, however, 
replace the derivative $\partial H/\partial J$ of (2) by the difference ratio
$\Delta H/\Delta J$, in which $\Delta H$ is the energy difference between two states,
$\Delta J$ the difference between their phase integrals. If we choose
two states whose quantum numbers differ by unity, we have
$\Delta J = h$, so that the difference ratio is

$$\frac{E_2 - E_1}{h} = \nu,$$

giving precisely the quantum frequency according to Eq. (3).
Hence we have the following relation: the derivative $\partial H/\partial J$
gives the classical frequency of motion of a system; the difference
ratio $\Delta H/\Delta J$, where the difference of $J$ is one unit, gives the
frequency of emitted light according to the quantum theory, 
or the frequency of oscillation of the oscillator mentioned in 
Sec. 202. We shall consider later the significance of transi- 
tions of more than one unit in J. 

For the oscillator, as one can immediately see from the fact 
that its energy in the $n$th state is $(n + \tfrac{1}{2})h\nu$, the classical and
quantum frequencies are exactly equal, the derivative equaling
the difference. This is plain from the fact that here $E = J\nu$,
so that the curve of E against J is a straight line, and the ratio 
of a finite increment in ordinate, divided by a finite increment 
in abscissa, equals the slope or derivative. But for any other 
case the curve of E against J is really curved, so that the deriva- 
tive and difference ratio are different, and classical and quantum 
frequencies do not agree. Thus in Fig. 65 we show an energy 
curve for an anharmonic oscillator, in which the tightness of 
binding decreases with increasing amplitude, the frequency 
decreases, and therefore the slope decreases with large quantum 
numbers. Here the classical frequency, as given by the slope 
of the curve, does not agree with the quantum frequency connected
with the transition indicated between two neighboring quantized
values of $J$, for the
quantum frequency is the slope of the straight line connecting
$E_2$ and $E_1$. We may assume, however, that if we go to a very
high quantum number, so that we are far out on the axis of 
abscissas, any ordinary energy curve will become asymptotically 
fairly smooth and straight, so that the chord and tangent to 
the curve will more and more nearly coincide. This certainly 
happens in the important physical applications we shall make. 
In these cases, we may state Bohr's correspondence principle:
in the limit of high quantum numbers, the classical and quantum
frequencies become equal. This is essentially simply a special
case of the general result stated in Chap. XXVIII, that in the
limit of small wave lengths (which for most practical purposes
is the same as the limit of high quantum numbers) the classical
and quantum theories become essentially equivalent.

Fig. 65. Energy curve for anharmonic oscillator. Slope of curve gives
classical frequency, slope of straight line connecting $E_2$ and $E_1$ gives quantum
frequency.
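
The approach of chord to tangent can be illustrated with a short computation. This is only a sketch under stated assumptions: the energy curve $E = J^s$ is hypothetical (with $s = 1$ standing for the linear oscillator and $s = 2/3$ for a system whose slope decreases with increasing $J$), units are chosen so that $h = 1$, and all function names are illustrative.

```python
H_PLANCK = 1.0  # units with h = 1, an assumption made only for convenience

def energy(J, s):
    """Hypothetical energy curve E = J**s (s = 1 is the linear oscillator with nu = 1)."""
    return J ** s

def classical_frequency(J, s):
    """Slope dE/dJ of the energy curve (the tangent)."""
    return s * J ** (s - 1.0)

def quantum_frequency(n, s):
    """Difference ratio between J = (n + 3/2)h and J = (n + 1/2)h (the chord)."""
    J_low = (n + 0.5) * H_PLANCK
    J_high = (n + 1.5) * H_PLANCK
    return (energy(J_high, s) - energy(J_low, s)) / H_PLANCK

def frequency_ratio(n, s):
    """Quantum frequency divided by the classical frequency of the lower state."""
    return quantum_frequency(n, s) / classical_frequency((n + 0.5) * H_PLANCK, s)

for n in (0, 10, 1000):
    print(n, frequency_ratio(n, 1.0), frequency_ratio(n, 2.0 / 3.0))
```

For the straight-line case the ratio is unity for every $n$; for the curved case it differs appreciably from unity at small $n$ and approaches unity as $n$ grows, which is the content of the correspondence principle for this assumed curve.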

217. The Quantum Condition for Several Degrees of Freedom. 
In classical mechanics, we have seen that certain problems, like 
the two-dimensional oscillator and the central field motion, are 
separable, so that they can be broken up into several one-dimen-
sional motions. Since each of these motions was periodic, the
whole motion is multiply periodic in these cases. In these partic-
ular problems with several degrees of freedom, separation of
variables can also be carried out in the quantum theory. In 
phase space we can pick out the two-dimensional space represent- 
ing one coordinate and its conjugate momentum, and the projec- 
tion of the representative point on this plane will trace out a 
closed curve. There is a quantum condition associated with this 
coordinate, the area enclosed by the curved path in the two- 
dimensional space being a half integer times h. Thus we have 
a quantum number associated with each degree of freedom in 
such a problem. Further, we can introduce angle and action 
variables connected with each of the coordinates, just as if each 
formed a problem of one degree of freedom. The various fre- 
quencies of the multiple periodicity can be found by differen- 
tiating the energy with respect to the various J's, and the 
correspondence principle can be applied to connect these classical 
frequencies with the quantum frequencies associated with various 
possible transitions. 

It can be shown in general that any coordinate of, say, a doubly 
periodic motion, can be analyzed into a sort of generalized Fourier 
series in the time, in which terms appear of frequencies

$$\tau_1\nu_1 + \tau_2\nu_2, \qquad (4)$$

where $\tau_1$, $\tau_2$ are arbitrary integers. This is the generalization
of the ordinary Fourier representation for a purely periodic
motion, in which all frequencies $n\nu_1$ will in general appear which
are integral multiples of the fundamental frequency. Now we
can carry out a general correspondence between any one of these
overtone or combination frequencies and a corresponding transi-
tion. Thus let us consider the transition in which $J_1$ changes
by $\tau_1$ units, $J_2$ by $\tau_2$ units, where $J_1$ and $J_2$ are the two action
variables. The quantum frequency emitted will be

$$\frac{E(J_1, J_2) - E(J_1 - \tau_1 h,\; J_2 - \tau_2 h)}{h}, \qquad (5)$$

where $E$ is the energy, written as function of the $J$'s. But if we
are allowed to replace differences by derivatives, as we assume we
are in the correspondence principle, this becomes

$$\frac{1}{h}\left(\frac{\partial E}{\partial J_1}\tau_1 h + \frac{\partial E}{\partial J_2}\tau_2 h\right) = \tau_1\nu_1 + \tau_2\nu_2,$$

in agreement with Eq. (4), if $\nu_1 = \partial E/\partial J_1$, $\nu_2 = \partial E/\partial J_2$. Thus
we have a one to one correspondence between all possible over- 
tone vibrations of the classical motion and all possible quantum 
transitions. This correspondence is of great importance, for 
instance, in discussing intensities of radiation, as we shall see 
later. For each component of the Fourier representation is 
a sinusoidal vibration, with frequency (4), and a certain ampli-
tude $A_{\tau_1,\tau_2}$. This oscillation, if it were the oscillation of an
electric charge, would send out a radiation of frequency (4),
with an intensity proportional to the square of the amplitude,
as we have seen in Chap. XXV, where we found a rate of radia-
tion $16\pi^4 A^2\nu^4/3c^3$. Thus the Fourier component $A$ would directly
determine the intensity of classical radiation. It then seems
very reasonable that, at least in the limit of high quantum num- 
bers, this intensity would agree approximately with the intensity 
of the corresponding quantum transition given by Eq. (5). Thus 
one can derive from the correspondence principle definite information
about probabilities of quantum transitions, for the rate of radia- 
tion of energy in a particular transition is proportional to the 
number of transitions occurring per unit time, or the probability 
of transition. We shall return to this question in a later chapter. 
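
The correspondence between combination frequencies and transitions can be tried out numerically. The sketch below rests on assumptions that are not in the text: the energy function $E(J_1, J_2) = J_1^{1.5} + 2J_2^{1.2}$ is hypothetical, units are such that $h = 1$, and the function names are illustrative. It compares the quantum frequency of Eq. (5) with the combination frequency $\tau_1\nu_1 + \tau_2\nu_2$ of Eq. (4).

```python
H_PLANCK = 1.0  # units with h = 1 (assumption for illustration)

def energy(J1, J2):
    """Hypothetical energy of a doubly periodic system as a function of the action variables."""
    return J1 ** 1.5 + 2.0 * J2 ** 1.2

def classical_frequencies(J1, J2):
    """nu_1 = dE/dJ_1 and nu_2 = dE/dJ_2 for the energy function above."""
    return 1.5 * J1 ** 0.5, 2.4 * J2 ** 0.2

def quantum_frequency(n1, n2, tau1, tau2):
    """Eq. (5): energy difference over h when J_1, J_2 drop by tau1*h, tau2*h."""
    J1 = (n1 + 0.5) * H_PLANCK
    J2 = (n2 + 0.5) * H_PLANCK
    return (energy(J1, J2) - energy(J1 - tau1 * H_PLANCK, J2 - tau2 * H_PLANCK)) / H_PLANCK

def combination_frequency(n1, n2, tau1, tau2):
    """Eq. (4): tau1*nu_1 + tau2*nu_2 evaluated at the initial state."""
    nu1, nu2 = classical_frequencies((n1 + 0.5) * H_PLANCK, (n2 + 0.5) * H_PLANCK)
    return tau1 * nu1 + tau2 * nu2

def relative_difference(n1, n2, tau1, tau2):
    """Fractional disagreement between the quantum and classical frequencies."""
    classical = combination_frequency(n1, n2, tau1, tau2)
    return abs(quantum_frequency(n1, n2, tau1, tau2) - classical) / abs(classical)

print(relative_difference(3, 3, 1, 1))        # low quantum numbers
print(relative_difference(2000, 2000, 2, 1))  # high quantum numbers
```

For this assumed energy function the disagreement is a few per cent at low quantum numbers and becomes very small at high ones.
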
The results which we have mentioned are all for multiply 
periodic, separable problems in several dimensions. With an 
n-dimensional problem, and a 2n-dimensional phase space, there 
are n J's which stay constant during the motion. Thus we may 
set up $n$ sets of surfaces, $J_1$ = constant, $J_2$ = constant, . . .
$J_n$ = constant, in the phase space, and the representative point
moves so that it stays on an intersection of all $n$ surfaces, or in
an $n$-dimensional region, instead of all through the $(2n - 1)$-dimensional
energy surface, as it would in quasi-ergodic motion.
The particular surfaces $J_1 = (n_1 + \tfrac{1}{2})h$, $J_2 = (n_2 + \tfrac{1}{2})h$, etc.,
divide up the phase space into cells, each of which is seen to have
the volume $h^n$, at least in simple cases, and a little examination
shows that there is just one stationary state per cell. Of course, 
the path of a representative point is always on an energy surface, 
and if we take only the quantized J values, the corresponding 
representative points lie only on the energy surfaces correspond- 
ing to quantized energy values. In many cases it proves to be 
true that a number of different stationary states have the same 
energy. Such a problem is called degenerate, and the number of
different states connected with the energy level is called the a
priori probability of the level. In such a case the volume of
phase space between this energy surface and the next adjacent
one proves to be $h^n$ times the a priori probability.

For a quasi-ergodic system, as we have said, there are no 
quantities like the J's which stay constant, other than the energy. 
There are still stationary states in the quantized problem, though 
they are not determined by ordinary quantum conditions. They 
are derived from solutions of the Schrodinger equation, however, 
and the boundary conditions lead to definite stationary states, 
as with one-dimensional motion. Thus we can always introduce 
energy surfaces in the phase space, corresponding to the quantized 
states. Generally quasi-ergodic systems are not degenerate, all 
energy levels being distinct, and the volume of phase space 
between successive energy levels will always be, at least to an 
approximation, equal to $h^n$. These relations prove to be of
importance in investigating the statistical mechanics of collections 
of systems in the phase space. 

218. Classical Statistical Mechanics in the Phase Space.— In 
Chap. IX we have investigated the motion of a representative 
point in the phase space. Statistical mechanics, however, like 
any statistical science, deals not with single points but with an 
enormous number of individuals, investigating their average 
behavior. In its applications to thermodynamic problems, there 
are two principal methods, both of which are frequently used. 
In the first of these, we deal, for instance, with a gas composed 
of a great many identical molecules. These molecules them- 
selves form the individuals whose average properties we investi- 
gate. Thus the phase space we use is one in which there are 
enough coordinates and momenta to describe a single molecule. 
Such a space is often called a $\mu$ space. The second method is
more powerful but more abstruse: the individuals with which we 
deal are whole systems, as whole samples of gas, and we imagine 
a large collection, often called an ensemble, or assembly, of such 
samples, all just alike in such gross properties as volume, tem- 
perature, and density, but differing in their finer details, as the 
positions and velocities of individual atoms or molecules. These 
might represent different pictures of the same gas at different 
times; or they might represent different repetitions of the same 
experiment, all controllable conditions being held fixed. Finding 
averages over such ensembles means then finding the time aver-
age, or finding the average obtained by repeating the experiment
many times. The phase space required for this second method
has as many coordinates and momenta as there are in the whole
system, a very great number if the system contains many mole-
cules. This space is often called the $\Gamma$ space. As to the dis-
tinction between the methods of the $\mu$ and the $\Gamma$ spaces, the
general situation is that they are equivalent when applied to
perfect gases; but if the molecules interact, they can no longer be
treated as independent systems and described by separate points
in the $\mu$ space, but one must instead consider the whole system
together, and use the $\Gamma$ space. The latter method is then the
one which we shall use more often. Both methods are alike,
however, in using phase spaces, and in considering the motion of
a swarm of points in such a space. 

We imagine an ensemble of a great many, or even an infinite 
number, of points in a phase space. As time goes on, with the 
points moving, the effect is as of the whole swarm flowing, like a 
liquid or gas composed of atoms. In fact, many of the ideas of 
hydrodynamics can be applied in this case, as we shall show in 
the next section. We introduce first the density of points as a
function of the $p$'s and $q$'s:

$$f(p_1 \ldots p_n, q_1 \ldots q_n)\,dp_1 \ldots dp_n\,dq_1 \ldots dq_n$$

gives the number of points in the $2n$-dimensional volume element
$dp_1 \ldots dp_n\,dq_1 \ldots dq_n$. The velocity of points in the phase
space is then given by Hamilton's equations, $dq_i/dt = \partial H/\partial p_i$,
$dp_i/dt = -\partial H/\partial q_i$, as we pointed out in Sec. 52, Chap. IX.
Thus we have the necessary quantities to describe the motion of
the points as a flow, and in the next section we apply the equation 
of continuity and investigate its consequences. 

219. Liouville's Theorem. — Consider the steady flow of a 
fluid of density $\rho$, velocity $v$. The equation of continuity is
$\partial\rho/\partial t + \operatorname{div}(\rho v) = 0$, or

$$\frac{\partial\rho}{\partial t} + \rho\,\operatorname{div} v + v\cdot\operatorname{grad}\rho = 0, \qquad (6)$$

if the density varies from point to point. We are interested
particularly in a divergenceless flow, for which $\operatorname{div} v = 0$, for it
turns out that the flow of points in the phase space is of this sort.
It is easy to see that this corresponds to the flow of an incom-
pressible fluid. For let us find $d\rho/dt$, the time rate of change of
density. This is given by

$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \frac{\partial\rho}{\partial x}\frac{dx}{dt} + \frac{\partial\rho}{\partial y}\frac{dy}{dt} + \frac{\partial\rho}{\partial z}\frac{dz}{dt} = \frac{\partial\rho}{\partial t} + v\cdot\operatorname{grad}\rho, \qquad (7)$$

where $d\rho/dt$ is the rate at which density changes if we follow
along with a particle of fluid. But now if $\operatorname{div} v = 0$, Eq. (6)
becomes $d\rho/dt = 0$, showing that the density following the
particle does not change with time, which is to be expected if the 
fluid is incompressible. This does not imply, however, that 
the density of the fluid is at all points the same. Let us imagine a 
fluid composed of large droplets of one sort of fluid suspended in 
another. If the fluids are chosen so that they do not mix, and 
the surfaces of separation remain sharp, then the density will 
change from point to point, as we go from the one fluid to the 
other. Further, if the whole fluid is moving, the density at any 
point of space will change with time, as first the one sort of fluid, 
then the other, will be carried past this point. But if the fluid 
is incompressible, the density of a particular part of the fluid, as 
we follow it in its motion, will be constant. That is, $v\cdot\operatorname{grad}\rho$
and $\partial\rho/\partial t$ are separately different from zero, but their sum
is zero.

The situation we have just described holds for the motion of 
points in the phase space. The 2n-dimensional velocity of points, 
as we have seen in the last section, has components dqi/dt, dpi/dt, 
where i goes from 1 to n. Then the analogue to the divergence is 

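$$\operatorname{div} v = \frac{\partial}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial}{\partial q_2}\frac{dq_2}{dt} + \cdots + \frac{\partial}{\partial p_1}\frac{dp_1}{dt} + \cdots$$

$$= \frac{\partial}{\partial q_1}\frac{\partial H}{\partial p_1} + \frac{\partial}{\partial q_2}\frac{\partial H}{\partial p_2} + \cdots - \frac{\partial}{\partial p_1}\frac{\partial H}{\partial q_1} - \cdots = 0. \qquad (8)$$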

Thus on account of Hamilton's equations the flow is divergence- 
less. Then we see that the flow is an incompressible flow, the 
density of points remaining constant as we follow a particle. 
This is Liouville's theorem. 
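
The cancellation in Eq. (8) can be seen numerically for a single degree of freedom. The following is a minimal sketch under assumptions of our own: the Hamiltonian $H = p^2(1 + cq^2)/2m + aq^4$ is hypothetical, chosen so that neither term of the divergence vanishes by itself, and derivatives are taken by central differences with an arbitrary step `EPS`.

```python
EPS = 1e-3  # finite-difference step

def hamiltonian(q, p, m=1.0, a=0.7, c=0.5):
    """Hypothetical one-dimensional Hamiltonian with a coordinate-dependent kinetic term."""
    return p * p * (1.0 + c * q * q) / (2.0 * m) + a * q ** 4

def q_dot(q, p):
    """dq/dt = dH/dp, by central difference."""
    return (hamiltonian(q, p + EPS) - hamiltonian(q, p - EPS)) / (2.0 * EPS)

def p_dot(q, p):
    """dp/dt = -dH/dq, by central difference."""
    return -(hamiltonian(q + EPS, p) - hamiltonian(q - EPS, p)) / (2.0 * EPS)

def divergence_terms(q, p):
    """Return (d(q_dot)/dq, d(p_dot)/dp), the two terms whose sum is div v."""
    dqdot_dq = (q_dot(q + EPS, p) - q_dot(q - EPS, p)) / (2.0 * EPS)
    dpdot_dp = (p_dot(q, p + EPS) - p_dot(q, p - EPS)) / (2.0 * EPS)
    return dqdot_dq, dpdot_dp

term_q, term_p = divergence_terms(1.0, 1.0)
print(term_q, term_p, term_q + term_p)
```

Each term is separately different from zero at the point chosen, but the two are equal and opposite, so that the flow is divergenceless there.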

220. Distributions Independent of Time. — The principal use 
of distributions in the phase space is for thermodynamic pur- 
poses, and here we are interested in thermal equilibrium, and 
in distributions independent of time. An ensemble independent 
of time is one for which $\partial\rho/\partial t = 0$. To get that, we see from
Eq. (7), since $d\rho/dt = 0$ by Liouville's theorem, that we must have
$v\cdot\operatorname{grad}\rho = 0$. This means that
the rate of change of density along the direction of flow, or 
along the streamline, is zero. In other words, all along a single 
line of flow, or through a single tube of flow of infinitesimal cross 
section, the density is constant. We may imagine the whole 
phase space divided up into tubes of flow, and then any distribu-
tion in which each tube has its own constant density through
its whole volume, no matter how this density may change from
one tube to another, will be independent of time.

In a multiply periodic motion, the lines of flow will be given
by $J_1$ = constant, $J_2$ = constant, . . . . Thus if we make the
density any arbitrary function of the $J$'s, we shall have a distribu-
tion independent of time. Remembering that the density is
the function $f$, this is

$$f(p_1 \ldots p_n, q_1 \ldots q_n) = F(J_1, J_2, \ldots, J_n). \qquad (9)$$

On the other hand, in a quasi-ergodic motion, a single line of flow 
will in time come arbitrarily near to every point of an energy 
surface. Thus the only distribution which will be independent 
of time in this case is one in which the density is constant all 
over an energy surface: 

$$f(p_1 \ldots p_n, q_1 \ldots q_n) = F(E). \qquad (10)$$

Of course, the ensemble (10) would be independent of time even 
in a multiply periodic motion, but it is more specialized than is 
necessary in that case. For instance, in an ensemble of systems 
each consisting of a particle in central motion, we could make an 
ensemble in which all parts of the phase space corresponding 
to the same energy had the same density, and this would be 
constant. But we could equally well make the density in differ- 
ent parts of the space corresponding to the same energy but 
different angular momentum different, and still, as long as 
the angular momentum was conserved, this distribution would 
be constant. Any perturbation which involved slow changes 
of angular momentum, however, would destroy the constancy 
of this distribution, whereas if we had started with one which 
depended only on energy, it would not be affected by such a
perturbation.

The ordinary systems which we deal with thermodynamically 
are assumed to be so complicated that they are quasi-ergodic. 
Thus the only type of ensemble independent of time is that of 
Eq. (10), in which the density is a function of the energy. This 
is the sort which we shall consider in thermodynamic applications. 
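
The condition $v\cdot\operatorname{grad} f = 0$ for a distribution independent of time can be tested on a simple case. This sketch is illustrative only: the anharmonic Hamiltonian $H = p^2/2 + q^4/4$, the two trial densities, and the finite-difference step are all assumptions made for the example.

```python
import math

EPS = 1e-5  # finite-difference step

def hamiltonian(q, p):
    """Hypothetical anharmonic oscillator, H = p^2/2 + q^4/4 (unit mass)."""
    return 0.5 * p * p + 0.25 * q ** 4

def partial_q(func, q, p):
    """Central-difference derivative with respect to q."""
    return (func(q + EPS, p) - func(q - EPS, p)) / (2.0 * EPS)

def partial_p(func, q, p):
    """Central-difference derivative with respect to p."""
    return (func(q, p + EPS) - func(q, p - EPS)) / (2.0 * EPS)

def rate_along_flow(density, q, p):
    """v . grad f, with v = (dq/dt, dp/dt) taken from Hamilton's equations."""
    q_dot = partial_p(hamiltonian, q, p)
    p_dot = -partial_q(hamiltonian, q, p)
    return q_dot * partial_q(density, q, p) + p_dot * partial_p(density, q, p)

def density_of_energy(q, p):
    """A density of the form F(E), here exp(-E)."""
    return math.exp(-hamiltonian(q, p))

def density_of_position(q, p):
    """A density depending on q alone, which is not a function of the energy."""
    return math.exp(-q * q)

print(rate_along_flow(density_of_energy, 0.8, 0.5))
print(rate_along_flow(density_of_position, 0.8, 0.5))
```

The density that depends only on the energy gives a rate of change along the streamline that vanishes to within the accuracy of the differences, while the density depending on the coordinate alone does not, and so would change with time.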

221. The Microcanonical Ensemble. — A particularly impor- 
tant ensemble is that called the microcanonical ensemble, in 
which all the systems of the ensemble have practically the same 
energy. More precisely, we have 


$$f(p_1 \ldots p_n, q_1 \ldots q_n) = F(E) = \begin{cases} \text{constant} & \text{for } E_0 < E < E_0 + \Delta E \\ 0 & \text{otherwise.} \end{cases} \qquad (11)$$

It is evident that an arbitrary ensemble can be made up by
superposing microcanonical ensembles, the ensemble whose
systems lie between $E_0$ and $E_0 + \Delta E$ having a constant density
so chosen as to give the proper density in that particular part 
of the energy space. In thermodynamics the microcanonical 
ensemble is often used, when we wish to deal with the statistical 
properties of systems at a given temperature, for energy content 
is correlated with temperature in such a way that systems of the 
same temperature have just about the same energy, and therefore 
are represented at least approximately by a microcanonical
ensemble.

222. The Canonical Ensemble. — More suitable than the 
microcanonical ensemble for discussing temperature equilibrium 
proves to be a slightly different one called the canonical ensemble.
In this distribution the density function is given by

$$f = \rho(E) = \text{constant}\; e^{-E/kT}, \qquad (12)$$

where $E$ is the energy, $k$ a constant, called Boltzmann's constant,
equal to $1.37 \times 10^{-16}$ c.g.s. units, $T$ is the absolute temperature.
We shall discuss in a later chapter the particular properties 
of this ensemble, and its advantages. This ensemble has not 
only the property of remaining unchanged with time, if the 
system is left to itself, but also of remaining unchanged if the 
system can interchange energy with another of the same tempera- 
ture. This is evidently necessary for thermal equilibrium, 
and the canonical ensemble is the only one in general which has 
this property. From this ensemble we can derive interesting 
results, though we mention only a few. We may, for instance, 
use the $\mu$ space, each system being a molecule. The energy
of such a molecule is $\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V$, so that the
probability of finding a molecule having its coordinates and
momenta within the limits $x$ and $x + dx$, $y$ and $y + dy$,
$z$ and $z + dz$, $p_x$ and $p_x + dp_x$, etc., is proportional to

$$e^{-\left[\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V\right]/kT}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z. \qquad (13)$$


This law is ordinarily called the Maxwell-Boltzmann distribu- 
tion law. From it we can easily find that the velocities are 
distributed according to Maxwell's distribution of velocities, 
and that the density in ordinary space at different points is 
proportional to $e^{-V/kT}$. We leave these proofs for problems.
If on the other hand we use the $\Gamma$ space, $E$ represents the energy
of the whole sample of gas, and we can prove easily that the 
energy of an individual sample in the ensemble is very nearly 
the same as that of any other sample. Thus for such a system 
the canonical ensemble is very similar to the microcanonical 
ensemble. One gets the same thermodynamic results using 
either ensemble, but the canonical ensemble is both more correct 
theoretically and decidedly simpler for most of its applications. 
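
The statement that the samples of a canonical ensemble in the $\Gamma$ space have very nearly the same energy can be illustrated by drawing random samples. The sketch below is a rough numerical experiment, not a proof: it assumes a gas of free molecules (no forces), momenta drawn from the factor $e^{-p^2/2mkT}$ of the Maxwell-Boltzmann law, units with $m = kT = 1$, and an arbitrary random seed; the function names are illustrative.

```python
import math
import random

def total_kinetic_energy(n_molecules, rng, m=1.0, kT=1.0):
    """Kinetic energy of one sample of gas with momenta drawn from exp(-p^2 / 2 m kT)."""
    sigma = math.sqrt(m * kT)                 # standard deviation of each momentum component
    energy = 0.0
    for _ in range(3 * n_molecules):          # p_x, p_y, p_z for every molecule
        p = rng.gauss(0.0, sigma)
        energy += p * p / (2.0 * m)
    return energy

def relative_spread(n_molecules, n_samples=500, seed=1):
    """Standard deviation of the total energy over the ensemble, divided by its mean."""
    rng = random.Random(seed)
    energies = [total_kinetic_energy(n_molecules, rng) for _ in range(n_samples)]
    mean = sum(energies) / n_samples
    variance = sum((e - mean) ** 2 for e in energies) / n_samples
    return math.sqrt(variance) / mean

small, large = relative_spread(10), relative_spread(400)
print(small, large)
```

The fractional spread in energy for the samples of 400 molecules comes out several times smaller than for those of 10 molecules, and for a sample of ordinary size it would be quite negligible.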

223. The Quantum Theory and the Phase Space.— In Sec. 210, 
Chap. XXIX, we have seen that a stationary state of a one- 
electron problem corresponds to a classical particle whose 
energy is determined, but whose initial time of starting is undeter- 
mined. More accurately, it corresponds to an ensemble of 
particles, all of the same energy, but with phases distributed 
in such a way that the properties of the ensemble are independent 
of time. This, however, is exactly a microcanonical ensemble. 
This may be connected with the uncertainty principle for energy, 
Eq. (2) of Chap. XXVIII, which states that the uncertainty 
of energy multiplied by the uncertainty of time is equal to h. 
If then we set up an ensemble of particles all of exactly the same 
energy, it must necessarily be true that the uncertainty of time 
of one of the particles is infinite. That is, we know nothing at 
all as to its phase, or the ensemble consists of particles in all 
possible phases. And since it is a stationary state we are dealing 
with, nothing depends on time. In other words, with the 
quantum theory, the mere process of setting up a stationary state
automatically sets up a microcanonical ensemble. We need not 
do that specially, and we need not prove Liouville's theorem 
to find out how to get ensembles independent of time. In this 
way the quantum mechanics is more convenient for statistical 
purposes than classical mechanics. 

With problems with several variables, the stationary state 
certainly represents an ensemble independent of time. If the 
problem is multiply periodic, it will represent an ensemble of 
states all of the same J values (that is, the same set of quantum 
numbers), but arbitrary phases. On the other hand, if it is
quasi-ergodic, it will represent a microcanonical ensemble.
And even in a multiply periodic, degenerate case, where there
are several stationary states of the same energy, we can always
set up a microcanonical ensemble, by combining all the various
states of the same energy. Each one of these states will corre-
spond to a volume $h^n$ of the phase space. Then if the micro-
canonical ensemble is to have a constant density of points over
a region between two energy surfaces, it will have a definite
number of points for each element of volume $h^n$, and hence a
constant and equal number of points for each of these substates 
of the same energy. We may say that in this ensemble the 
number of systems in any group of substates is proportional 
to the a priori probability of this group of states; that is, simply 
proportional to the number of substates in the group. 

The distribution function $f(p_1 \ldots p_n, q_1 \ldots q_n)$ for the quantum
theory involves us in rather complicated considerations, which
we shall take up in the next chapter. The reason is that the
probability function which we are given directly is the square
of the wave function, $\psi$, and that is a function of the coordinates
only, giving the probability of finding the coordinates within 
certain limits, independent of the momenta. In Sec. 210, 
Chap. XXIX, we have shown that this probability function 
approximately agrees with that found in classical mechanics. 
We postpone other comparisons between the quantum and 
classical distributions. But there is one feature of the quantum 
distribution function which should be mentioned at the outset. 
We have spoken above as if one could draw the paths of particles, 
and set up distribution functions, in the phase space, for the 
quantum theory as for the classical theory. But this is really 
not possible, as we can see from the uncertainty principle. 
This says that the uncertainty in the coordinate of a particle, 
multiplied by the uncertainty of its momentum, is of the order 
of magnitude of h. This product of uncertainties is simply an 
area in phase space. Instead of representing the particle by 
a sharp point, we can visualize it as a region in phase space, of 
dimensions $\Delta q$ and $\Delta p$ along the two axes. By the uncertainty
principle, the area of this region is $h$. If we had the same
thing in a number of dimensions, as $n$ variables, the $2n$-dimen-
sional volume associated with the uncertain position and momen-
tum of the particle or representative point would be $h^n$, just the
volume associated with a stationary state. As a result of this
uncertainty, we must always be cautious about using the ideas
of definite paths of representative points in the phase space.
It would perhaps be more accurate to think of the paths, and
energy surfaces, as having definite thicknesses, as if the point
carried along its volume $h^n$, and allowed that to trace out a finite
region of phase space. 

The canonical ensemble can be set up in quantum theory as in 
classical mechanics. In the classical theory, it is the ensemble 
in which the number of points per unit volume is proportional 
to $e^{-E/kT}$. In quantum theory, the number of points in volume
$h^n$, or the number in a given stationary state, is proportional
to $e^{-E/kT}$, or this exponential is proportional to the probability
of finding a system, chosen at random from the ensemble, in the
stationary state in question. If we group together a number of
degenerate substates all of energy $E$, and if there are $g$ of them,
so that the a priori probability of the group is $g$, the number of
systems in the group is proportional to $g e^{-E/kT}$.
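
As a small worked case of the rule $g e^{-E/kT}$, the following sketch computes the fraction of systems in each group of substates. The two levels and their degeneracies are hypothetical, chosen only for illustration, and energies are measured in units of $kT$.

```python
import math

def canonical_populations(levels, kT):
    """levels: list of (energy, g) pairs, g being the number of degenerate substates.
    Returns the fraction of systems in each group, proportional to g * exp(-E / kT)."""
    weights = [g * math.exp(-energy / kT) for energy, g in levels]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical levels: a single ground state and a triply degenerate level one kT above it.
levels = [(0.0, 1), (1.0, 3)]
fractions = canonical_populations(levels, kT=1.0)
print(fractions)
```

The ratio of the two fractions is $3e^{-1}$: the upper group holds more systems than the single lower state in this assumed case, although each of its substates separately holds fewer.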


1. Take the problem of a particle executing one-dimensional motion in a 
container with constant potential inside, but impenetrable walls, as in Prob. 
11, Chap. XXIX. Plot the path of the representative point in phase space, 
find the phase integral, and show that the quantum condition leads to the 
same stationary states and energy levels that were determined previously, 
except that it leads to half rather than whole quantum numbers. 

2. For the system of Prob. 1, compare (a) the frequency of oscillation of 
the particle back and forth between the walls, as determined classically by 
elementary argument; (b) the same frequency as determined by the formula 
dH/dJ; (c) the emitted frequency on the quantum theory. 

3. Draw the phase space for a rotator, as described in Sec. 213, and verify 
the quantum condition stated there. 

4. Apply the correspondence principle to the radiation from a linear 
oscillator. Show that the Fourier components of the classical motion are 
zero corresponding to all transitions except those in which the quantum 
number changes by one unit only. From this one may infer that in the 
quantum theory only this particular transition can occur, the probability 
of any other sort of transition being zero. Such a result is called a selection
rule.

5. Consider the motion of Prob. 6, Chap. IX, in which a particle executed 
simple harmonic motion on a rotating turntable. Assume that one quantum 
number, and phase integral, is associated with the rapid frequency of oscilla- 
tion, and the other phase integral with the slower frequency of rotation of 
the turntable. From the Fourier analysis of the x component of motion, 
show that the only allowed transitions are those in which each quantum
number changes by ±1 unit. Show further that both must change together,
there being no transitions of one quantum number alone, but that a transi-
tion of +1 unit in one of the quantum numbers is equally likely to occur in con-
nection either with +1 or -1 of the other.

6. Find Maxwell's distribution of velocities, stating that the number of
molecules of a gas for which the velocity is between $v$ and $v + dv$ is propor-
tional to

$$v^2 e^{-mv^2/2kT}\,dv.$$

To do this, use $\mu$ space, assume the Maxwell-Boltzmann distribution law.
Consider a fixed point of space, so that $x$, $y$, $z$ are constant, and we need only
consider the three-dimensional momentum space. Note that the velocity
is proportional to the radius $\sqrt{p_x^2 + p_y^2 + p_z^2}$ in the momentum space.
The number of molecules between v and v + dv is then proportional to the 
density of molecules in the momentum space, which from the Maxwell- 
Boltzmann law is constant for constant v, times the volume of momentum 
space between v and v + dv, which can be computed from the ordinary 
geometrical relations of a sphere. Determine the constant factor in the 
law so that your formula will give directly the fraction of all molecules in 
the range dv. 

7. Find the mean kinetic energy of a molecule at temperature T. Note 
that the mean of any quantity F(p, q) is given by 

$$\bar{F} = \frac{\int F(p, q)\, f(p, q)\, dp \ldots dq \ldots}{\int f(p, q)\, dp \ldots dq \ldots},$$

where f(p . . . q . . . ) is the density function in the phase space, and the 
integration is over all parts of the phase space. Note also that since in this 
case F depends only on the momentum, the integrals in numerator and 
denominator can be factored into one integral over the momenta, one over 
the coordinates, and that the latter cancel out. 

8. By integrating over all momenta, show that the space density of 
molecules in a gas is proportional to $e^{-V/kT}$. Apply this to the density of
the atmosphere in the earth's gravitational field, assuming constant tem- 
perature. Find from this the rate of decrease of barometric pressure with 
altitude, at the earth's surface, assuming a reasonable atmospheric 

9. In the $\Gamma$ space, consider a canonical ensemble of $N$ identical molecules,
where $N$ is very large. Assume that no forces act. Find the number of
systems of the ensemble for which the total energy is between definite limits
$E$ and $E + dE$. To do this, note that the energy is proportional to
$p_{x1}^2 + p_{y1}^2 + \cdots + p_{zN}^2$, or the square of the $3N$-dimensional radius of the momen-
tum space, so that the part of the space between $E$ and $E + dE$ is the region
between two corresponding hyperspheres. Note that the "volume" of a
two-dimensional "sphere" (a circle) is $\pi r^2$; of a three-dimensional one, $\tfrac{4}{3}\pi r^3$;
of an $n$-dimensional one, constant times $r^n$. Also note that the volume
between $r$ and $r + dr$ is $\frac{d(\text{volume})}{dr}\,dr$.


10. Show that the fraction of all systems in a canonical ensemble for
which energy is between $E$ and $E + dE$ is approximately given by a Gaus-
sian error curve, $A e^{-c(E - a)^2}$. Find $c$ and $a$. (Hint: The function found in
Prob. 9 has a very sharp maximum, to be approximated by the error curve
above. Expand the logarithm of the function in power series about its
maximum, $a$, so that the logarithm equals constant $- c(E - a)^2 + \cdots$,
where there is no first power term because the expansion is about the maxi-
mum, and higher power terms than the second are to be neglected. Then
the function is the exponential of this power series, giving the value above.
Show that the third and higher power terms are negligible unless E — a is so 
large that the function itself has sunk to a negligible value.) 

11. In the distribution of Prob. 10, show that the mean energy of the 
systems of the ensemble is just N times the energy of a single molecule, as 
found in Prob. 7. To get an idea of the range of distribution of energy 
about this mean, find the energy for which the Gaussian distribution curve 
falls to half its maximum value. Show that the energy difference between 
this value and the mean increases proportionally to $\sqrt{N}$, but that the 
percentage deviation of the energy, or the deviation divided by the total 
energy, goes down as $1/\sqrt{N}$, so that for large N the percentage deviation 
is extremely small. 



Suppose we have a problem, like the linear oscillator, in which 
there are no motions which go to infinity; that is, in which every 
motion is quantized, so that only discrete energy values are 
allowed. Let the nth energy value be $E_n$, the corresponding 
wave function $u_n$. Then a general solution of the wave problem, 
involving the time, is 

$$\psi = \sum_n c_n e^{-\frac{2\pi i}{h}E_n t}\,u_n(x, y, z), \tag{1}$$

where we choose the negative exponential for reasons which 
will appear later. This function will shortly be derived as the 
solution of a wave equation involving the time, though we have 
not yet written down that equation. Now let us recall the meaning 
of $\psi$. It is the amplitude of a wave whose intensity gives 
the probability that the particle be found at a given place at a 
given time. Since $\psi$ is complex, this intensity is given by 
multiplying by its conjugate; hence $\bar\psi\psi$ gives the desired 
probability. Or more precisely, the probability that the particle, 
at time t, is in the volume element dx dy dz, is $\bar\psi\psi\,dx\,dy\,dz$. 
One result appears at once from this: the probability that the 
particle be somewhere is unity, and this must be the sum of 
the probabilities that it be in all separate parts of space, or 
the integral of the probability over all space: 

$$\iiint \bar\psi\psi\,dx\,dy\,dz = 1. \tag{2}$$

Now having the probability, we can proceed to get statistical 
information about the behavior of the particle. 

224. Mean Value of a Function of Coordinates.— As we have 
seen in the last chapter, the first step in a statistical investiga- 
tion is to find a distribution function. There we were interested 
in functions of coordinates and momenta of a particle or system, 
and we had a function $f(q_1, \ldots, q_n, p_1, \ldots, p_n)$, such that 
$f\,dq_1 \ldots dp_n$ gave the number of systems having coordinates 
and momenta in the range $dq_1 \ldots dp_n$. To find the average 
of any quantity, given such a distribution, we proceed as follows: 
if the quantity is $F(q_1, \ldots, p_n)$, a function of coordinates and 
momenta, we multiply the function by the fraction of systems 
having those particular q's and p's, and integrate over all q's 
and p's. This fraction is 
$\frac{f\,dq_1 \ldots dp_n}{\int f\,dq_1 \ldots dp_n}$, so that the result is 

$$\bar{\bar F} = \frac{\int F f\,dq_1 \ldots dp_n}{\int f\,dq_1 \ldots dp_n}, \tag{3}$$

where we denote the average of F by $\bar{\bar F}$, to avoid confusion with 
the single bar indicating complex conjugates. Similarly in the 
present case we have a function $\bar\psi\psi$ which is a distribution 
function as far as coordinates are concerned: $\bar\psi\psi\,dx\,dy\,dz$ is the 
probability (directly, since $\int \bar\psi\psi\,dx\,dy\,dz = 1$) that the particle 
have coordinates within dx dy dz. Thus if we have a function 
F(x, y, z) of the coordinates, and wish its mean value, we have 

$$\bar{\bar F} = \int F\,\bar\psi\psi\,dx\,dy\,dz = \int \bar\psi F\psi\,dx\,dy\,dz, \tag{4}$$

where we prefer the latter method of writing it because it fits 
in with formulas which we shall have later. This does not tell 
us how to find averages of functions of the momenta, such as 
for example the energy; that is more complicated, and will be 
discussed in a later section. But we may wish, for instance 
with an atom or molecule, to find the mean value of the center 
of gravity, or moment of inertia, or some such function of posi- 
tion alone, and the formula suffices to determine it. 

It is now very interesting to substitute our expansion of $\psi$ 
in the expression for a mean value. That gives 

$$\bar{\bar F} = \sum_{n,m} \bar c_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t} \int \bar u_n F u_m\,dx\,dy\,dz = \sum_{n,m} \bar c_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t} F_{nm}, \tag{5}$$

where by definition $F_{nm} = \int \bar u_n F u_m\,dx\,dy\,dz$. The quantities 
$F_{nm}$ form a two-dimensional array of numbers, of the sort known 
in mathematics as a matrix, and the individual $F_{nm}$'s are called 
matrix components. 
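As a rough numerical illustration of Eq. (5), the following sketch compares the mean value of x computed directly from the integral of $\bar\psi F\psi$ with the double sum over matrix components. The system is an assumption made only for the example: a particle in a one-dimensional box, with $u_n = \sqrt{2/L}\sin(n\pi x/L)$ and $E_n = n^2h^2/8mL^2$, in units where h = m = L = 1. The two coefficients and all function names are likewise hypothetical.

```python
import cmath
import math

# Assumed illustrative system (not from the text): a particle in a
# one-dimensional box of length BOX, in units where h = m = BOX = 1.
H_PLANCK = 1.0
MASS = 1.0
BOX = 1.0
GRID = 2000  # number of midpoint-rule points


def u(n, x):
    """Normalized box eigenfunction u_n(x); it is real, so conj(u_n) = u_n."""
    return math.sqrt(2.0 / BOX) * math.sin(n * math.pi * x / BOX)


def energy(n):
    """Energy E_n of the nth box state."""
    return n * n * H_PLANCK ** 2 / (8.0 * MASS * BOX ** 2)


def integrate(func):
    """Midpoint-rule integral of func over the box."""
    dx = BOX / GRID
    return sum(func((j + 0.5) * dx) for j in range(GRID)) * dx


def matrix_component(F, n, m):
    """F_nm = integral of conj(u_n) F u_m."""
    return integrate(lambda x: u(n, x) * F(x) * u(m, x))


def psi(coeffs, x, t):
    """Wave function of Eq. (1) for the given coefficients c_n."""
    return sum(c * cmath.exp(-2j * math.pi * energy(n) * t / H_PLANCK) * u(n, x)
               for n, c in coeffs.items())


def mean_direct(F, coeffs, t):
    """Mean value from Eq. (4): integral of conj(psi) F psi."""
    def integrand(x):
        value = psi(coeffs, x, t)
        return (value.conjugate() * F(x) * value).real
    return integrate(integrand)


def mean_from_matrix(F, coeffs, t):
    """Mean value from the double sum of Eq. (5)."""
    total = 0j
    for n, c_n in coeffs.items():
        for m, c_m in coeffs.items():
            phase = cmath.exp(2j * math.pi * (energy(n) - energy(m)) * t / H_PLANCK)
            total += c_n.conjugate() * c_m * phase * matrix_component(F, n, m)
    return total.real


def position(x):
    return x


# Equal mixture of the first two states, with a relative phase of 90 degrees.
coeffs = {1: complex(math.sqrt(0.5), 0.0), 2: complex(0.0, math.sqrt(0.5))}
direct = mean_direct(position, coeffs, 0.3)
via_matrix = mean_from_matrix(position, coeffs, 0.3)
print(direct, via_matrix)
```

The two printed numbers should agree to rounding error, since the double sum is only a rearrangement of the direct integral.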

225. Physical Meaning of Matrix Components. — Suppose we 
have an electron in an atom, and try to find its electric moment 
as a function of time; that is, its charge e times the displacement 
of the electron, x. In other words, we wish the mean 
value of ex, which is 

$$\overline{\overline{ex}} = \sum_{n,m} \bar c_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t}\,(ex)_{nm}. \tag{6}$$

We observe that in the mean moment the terms depend on time 
through the expression $e^{\frac{2\pi i}{h}(E_n - E_m)t}$, having the 
frequency $(E_n - E_m)/h$. But this is just the frequency which by the quantum 
theory the atom should emit in jumping between the energy 
levels m and n. Hence we connect this particular matrix compo- 
nent with this transition. By the correspondence principle, 
in Sec. 217, Chap. XXX, we have already seen that associated 
with each transition there is a classical frequency of oscillation, 
and a corresponding Fourier component of the motion. It 
can now be shown that this Fourier component, in the limit of 
large quantum numbers, becomes equal to the matrix component 
$(ex)_{nm}$ of the electric moment, which appears in Eq. (6). The 
individual terms of Eq. (6) act like oscillators, radiating energy, 
and it proves to be true, though it requires a difficult analysis 
to show it, that the rate of radiation of the oscillator determines 
exactly the quantum probability of transition. For example, 
if a matrix component is zero, there will be no radiation of the 
corresponding frequency, no transitions are possible between 
the stationary states concerned, and we have what is called a 
selection principle, a principle selecting out certain transitions 
which can occur, the rest being forbidden. 

The matrix components which we have noticed have been those 
where m and n were different. If we make a scheme of matrix 
components like 

$$F = \begin{pmatrix} F_{11} & F_{12} & F_{13} & \cdots \\ F_{21} & F_{22} & F_{23} & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{pmatrix} \tag{7}$$

we see that the components $F_{11}$, $F_{22}$, etc., along the principal 
diagonal all have m = n, so that our components with $m \neq n$ 
are just the nondiagonal components. The diagonal components, 
however, have a different meaning. They refer to time 
average properties of the system, rather than to the sinusoidal 
properties which are connected with radiation. Thus if we 
take the time average of $\bar{\bar F}$ (where the averaging in $\bar{\bar F}$ 
refers to averaging over the probability distribution, not over time), 
the exponential term $e^{\frac{2\pi i}{h}(E_n - E_m)t}$ averages to zero, 
unless n = m, in which case it is unity. Hence we have 

$$\text{time average of } \bar{\bar F} = \sum_n \bar c_n c_n F_{nn}, \tag{8}$$


the double sum reducing to a single sum. Here, as we said above, 
only the diagonal components of the matrix of F appear. 

We can understand this formula better if we notice the mean- 
ing of the c's. To get at this, we observe that the c's are the 
amplitudes by which the various overtones are multiplied, in 
order to get the whole wave function. Thus the quantities $\bar c_n c_n$, 
the squares of these amplitudes (taking account of the fact that 
they may be complex by multiplying by the conjugate) are 
quantities proportional to the intensities of the various overtones; 
and the interpretation of this is that they are proportional 
to the probability that the particle be in a given stationary state. 
As a matter of fact, we shall soon show that $\bar c_n c_n$ represents just 
the probability itself, the sum of all the probabilities, 
$\sum_n \bar c_n c_n$, being unity. Thus the formula 

$$\text{time average of } \bar{\bar F} = \sum_n \bar c_n c_n F_{nn}$$

means that $F_{nn}$ is the time average of F over the nth stationary 
state, and $\bar c_n c_n$ the probability of finding the system in this 
stationary state, so that we multiply together and add to get the 
average over all stationary states. 
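The reduction of the double sum to Eq. (8) can be checked numerically. The sketch below uses a hypothetical three-level system: the energies, the matrix $F_{nm}$, and the coefficients $c_n$ are invented for the illustration, and the energies are chosen (with h = 1) so that every beat frequency $(E_n - E_m)/h$ is a whole number and one unit of time is a common period.

```python
import cmath
import math

H_PLANCK = 1.0  # Planck's constant in arbitrary units (assumption)

# Hypothetical data: energies, a real symmetric matrix F_nm, and
# coefficients c_n whose squared magnitudes add up to unity.
energies = [1.0, 2.0, 4.0]
F = [[0.50, 0.20, 0.05],
     [0.20, 0.30, 0.10],
     [0.05, 0.10, 0.80]]
c = [complex(0.6, 0.0), complex(0.0, 0.64), complex(0.48, 0.0)]


def mean_value(t):
    """Mean value of F at time t from the double sum of Eq. (5)."""
    total = 0j
    for n in range(3):
        for m in range(3):
            phase = cmath.exp(2j * math.pi * (energies[n] - energies[m]) * t / H_PLANCK)
            total += c[n].conjugate() * c[m] * phase * F[n][m]
    return total.real


# Average over one common period of all the beat frequencies.
PERIOD = 1.0
SAMPLES = 64
time_average = sum(mean_value(j * PERIOD / SAMPLES) for j in range(SAMPLES)) / SAMPLES

# Right side of Eq. (8): only the diagonal components survive.
diagonal_sum = sum(abs(c[n]) ** 2 * F[n][n] for n in range(3))

print(time_average, diagonal_sum)
```

The nondiagonal terms oscillate and cancel over the period, leaving only the weighted sum of the diagonal components.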

226. Initial Conditions, and Determination of c's. — Just as 
with the problem of the vibrating string, we may have initial 
conditions: we may know that the distribution \p has a certain 
value at t = 0. Let us take a specific example: we may know 
that at t = 0 the particle is inside a given small volume, though 
we do not know where in that volume. Then we may ask as 
to its probable later motion. That is, we know that $\psi(x, y, z, t)$ 
is zero, at t = 0, outside the small volume, and has a constant 
value, or at least a sinusoidal form with constant amplitude, 
inside the volume. Now at t = 0, the exponentials become 
unity, so that we have $\psi(x, y, z, 0) = \sum_n c_n u_n(x, y, z)$. But this 
is just the familiar problem of expanding an arbitrary function 
of x, y, z in a series of functions $u_n$. These are orthogonal 
functions; they are solutions of Schrodinger's equation, which is 
of the type already discussed in Prob. 10, Chap. XV, where we 
showed in general that the solutions were orthogonal. We 
assume them to be also normalized. Thus the c's are simply the 
coefficients of expansion, determined directly by multiplying by 
the corresponding normal function and integrating. We must be 
careful of only one thing: our functions are now often complex, 
and when we multiply two such functions together, in such cases, 
it proves to be necessary always to multiply so that a function 
and a conjugate appear together. Thus we have 

$$\iiint \psi(x, y, z, 0)\,\bar u_m(x, y, z)\,dx\,dy\,dz = \sum_n c_n \int u_n \bar u_m\,dx\,dy\,dz.$$

But now the orthogonality is such that $\int u_n \bar u_m\,dx\,dy\,dz$ is 
unity if n = m, zero if $n \neq m$, so that we have 

$$c_m = \int \psi\,\bar u_m\,dv. \tag{9}$$

The physical situation is then this. If we know initially 
the distribution of coordinates, we can find a $\psi$ satisfying the 
conditions, and in general all the c's will be different from zero. 
That is, all overtones will be excited, or the system will be 
partly in each stationary state. We may say, if we choose, 
that we have an ensemble, and that a system of this ensemble 
has a probability $\bar c_n c_n$ of being in the nth state. If now we ask 
how $\psi$ changes with time, we can see that the particle will no 
longer have the initial distribution of probability, but that the 
probability will change with time. For example, if we originally 
know it is in a small volume, this will not continue to be true 
as time goes on; it will have a chance of moving out of the volume. 
The reason is that the different waves cooperate to give just 
the right function at t = 0, but they vibrate with different 
frequencies, and soon they get out of step, and can no longer 
cooperate properly. Thus a general wave function, made by 
superposing many stationary states, does not represent an ensem- 
ble independent of time, though a single wave function does. 
Though the probabilities as functions of the coordinates change 
with time, it is significant that the c's, being constants, do not. 
Thus the probability of finding the atom in a given stationary 
state does not change. The atoms do not go from one to another, 
and the states are really stationary. This is all true only 


if we neglect radiation, or external forces. If there is radiation, 
the whole situation will be altered, the c's will change with 
time, and the time rate of change of any $\bar c c$ will be interpreted 
as being connected with a corresponding probability that atoms 
are having transitions to or from this state. It is much as with 
vibrating strings: if the string is started off with a complicated 
shape, this shape will be soon changed, but if there is no friction 
we can analyze the motion into overtones, and each overtone 
preserves its amplitude. If friction is present, however, the 
overtones change their amplitudes. 
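The determination of the c's from an initial condition, and the later spreading of the probability, can be sketched numerically. The example assumes, purely for illustration, a particle in a one-dimensional box of unit length (h = m = 1), initially spread with constant amplitude over the small region 0.4 < x < 0.6. The coefficients follow from Eq. (9), here integrated in closed form. The series is cut off at 200 terms, so the sum of the $\bar c_n c_n$ falls slightly short of unity.

```python
import cmath
import math

# Assumed illustrative system (not from the text): box of unit length,
# units with h = m = 1, initial state uniform on the interval [A, B].
H_PLANCK = 1.0
MASS = 1.0
BOX = 1.0
N_TERMS = 200
A, B = 0.4, 0.6


def u(n, x):
    """Normalized box eigenfunction (real)."""
    return math.sqrt(2.0 / BOX) * math.sin(n * math.pi * x / BOX)


def energy(n):
    return n * n * H_PLANCK ** 2 / (8.0 * MASS * BOX ** 2)


def coefficient(n):
    """c_n from Eq. (9) for psi(x, 0) = 1/sqrt(B - A) inside [A, B], zero outside."""
    k = n * math.pi / BOX
    amplitude = math.sqrt(2.0 / BOX) / math.sqrt(B - A)
    return amplitude * (math.cos(k * A) - math.cos(k * B)) / k


coeffs = [coefficient(n) for n in range(1, N_TERMS + 1)]


def psi(x, t):
    """Truncated expansion of the wave function at time t."""
    return sum(c * cmath.exp(-2j * math.pi * energy(n) * t / H_PLANCK) * u(n, x)
               for n, c in enumerate(coeffs, start=1))


def probability_inside(t, points=400):
    """Probability of finding the particle in the original region at time t."""
    dx = (B - A) / points
    return sum(abs(psi(A + (j + 0.5) * dx, t)) ** 2 for j in range(points)) * dx


total_probability = sum(c * c for c in coeffs)  # does not depend on time
p_initial = probability_inside(0.0)
p_later = probability_inside(0.037)

print(total_probability, p_initial, p_later)
```

The sum of the $\bar c_n c_n$ is fixed once and for all, while the probability of remaining in the original region falls as the overtones get out of step.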

227. Mean Values of Functions of Momenta. — The method of 
finding mean values of functions of the coordinates is perfectly 
straightforward, but the treatment of the momenta is peculiar, 
and is one of the characteristic features of wave mechanics. The 
momentum shows itself in the wave function through the wave 
length of the wave, and in order to get information about wave 
length, it turns out that the proper procedure is to differentiate 
the wave function. We can find the correct formulas from a very 
simple case; and since we are setting up a theory which is not 
derived from any other, we can do nothing but postulate the 
general formulas, which prove to be the same ones that we find 
in this special case. Thus suppose we have a free particle in 
empty space, traveling with a momentum p, energy E. Its 
wave function, if it travels along the x axis, will be 
$e^{\frac{2\pi i}{h}(px - Et)}$, corresponding to the wave length 
$1/\lambda = p/h$. More generally, if its components of momentum along 
the three axes are $p_x$, $p_y$, $p_z$, its wave function will be 

$$e^{\frac{2\pi i}{h}(p_x x + p_y y + p_z z - Et)},$$

a plane wave. If we wanted to find the mean x momentum of this 
particle, we should multiply $p_x$ by the probability, and integrate; 
we should get $p_x$, of course, since the mean value of a constant is 
the same constant. But the question is, how is this to be generalized 
so that it can be used in more complicated cases, where the 
momentum does not appear explicitly, and is not constant? The 
answer proves to be the following: If our function is $\psi$, we observe 
that $\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}$ equals $p_x\psi$. Thus if we 
form the expression $\bar\psi\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}$ 
and integrate, the answer will be the same as integrating $\bar\psi p_x \psi$. 
Similarly, 

$$\int \bar\psi \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 \psi\,dv$$

would give $p_x^2$, and so on. In other words, the operator 
$\frac{h}{2\pi i}\frac{\partial}{\partial x}$, operating on $\psi$, and averaged, 
can be taken to stand for the x component of momentum. 

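It is now assumed that this process can be applied in general. 
Thus with any wave function $\psi$, the mean value of the x component 
of momentum is $\int \bar\psi\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}\,dv$. 
Or more generally, if we have any function of momenta and coordinates, 
say $F(x, y, z, p_x, p_y, p_z)$, we have for the mean value 

$$\int \bar\psi\, F\!\left(x, y, z, \frac{h}{2\pi i}\frac{\partial}{\partial x}, \frac{h}{2\pi i}\frac{\partial}{\partial y}, \frac{h}{2\pi i}\frac{\partial}{\partial z}\right) \psi\,dv. \tag{11}$$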
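The rule can be tried numerically on a wave packet. The sketch below takes, as an assumption for the example only, a Gaussian packet $\psi = (2a/\pi)^{1/4} e^{-ax^2} e^{\frac{2\pi i}{h}p_0 x}$ in one dimension with h = 1, and evaluates the mean values of $p_x$ and $p_x^2$ by applying the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ through central differences. For this packet the known results are $p_0$ and $p_0^2 + (h/2\pi)^2 a$. The function names, step sizes, and integration limits are arbitrary choices.

```python
import cmath
import math

H_PLANCK = 1.0   # assumed units
P0 = 0.5         # mean momentum built into the packet (hypothetical value)
WIDTH = 1.0      # the constant a in exp(-a x^2) (hypothetical value)


def psi(x):
    """Normalized Gaussian wave packet carrying momentum P0."""
    norm = (2.0 * WIDTH / math.pi) ** 0.25
    return norm * math.exp(-WIDTH * x * x) * cmath.exp(2j * math.pi * P0 * x / H_PLANCK)


def momentum_operator(func, x, step=1e-4):
    """(h / 2 pi i) d/dx applied to func at x, by a central difference."""
    derivative = (func(x + step) - func(x - step)) / (2.0 * step)
    return H_PLANCK / (2j * math.pi) * derivative


def momentum_squared(func, x):
    """The momentum operator applied twice."""
    return momentum_operator(lambda y: momentum_operator(func, y), x)


def mean_value(operator, lo=-8.0, hi=8.0, points=4000):
    """Integral of conj(psi) * (operator acting on psi), midpoint rule."""
    dx = (hi - lo) / points
    total = 0j
    for j in range(points):
        x = lo + (j + 0.5) * dx
        total += psi(x).conjugate() * operator(psi, x) * dx
    return total


mean_p = mean_value(momentum_operator)
mean_p2 = mean_value(momentum_squared)
expected_p2 = P0 ** 2 + (H_PLANCK / (2.0 * math.pi)) ** 2 * WIDTH

print(mean_p, mean_p2, expected_p2)
```

The imaginary parts vanish to within the accuracy of the differencing, as they must for a quantity that is to be interpreted as a mean momentum.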

This is the general rule, reducing to our former one when F 
involves only coordinates. There is one difficulty connected with 
this, however. It turns out that if there are any terms in F 
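involving products of coordinates and momenta, the answer will 
depend on the order in which they occur. The best example is 
the case of the product $p_x x$. We have 

$$\overline{\overline{p_x x}} = \int \bar\psi\,\frac{h}{2\pi i}\frac{\partial}{\partial x}(x\psi)\,dv = \frac{h}{2\pi i} + \overline{\overline{x p_x}},$$

or 

$$p_x x - x p_x = \frac{h}{2\pi i}. \tag{12}$$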

This is the so-called commutation rule ; it states that interchange, 
or commutation, of the order of a coordinate and momentum 
operator changes the value, since the difference is not zero. In 
most actual cases that we meet, we shall not be troubled by this 
difficulty of noncommutability of coordinates and momenta, but 
it is something against which we must be on our guard. 
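A direct numerical check of the commutation rule is easy to set up. In the sketch below the operators act on an arbitrary trial function (a Gaussian times a phase factor, chosen only for the illustration), the differentiation is done by central differences, and h is set equal to 1. All names are hypothetical.

```python
import cmath
import math

H_PLANCK = 1.0  # assumed units


def psi(x):
    """Arbitrary smooth trial function (an assumption for the test)."""
    return math.exp(-x * x) * cmath.exp(1.5j * x)


def apply_p(func, step=1e-4):
    """Return the function (h / 2 pi i) d(func)/dx, by central differences."""
    def result(x):
        derivative = (func(x + step) - func(x - step)) / (2.0 * step)
        return H_PLANCK / (2j * math.pi) * derivative
    return result


def apply_x(func):
    """Return the function x * func(x)."""
    return lambda x: x * func(x)


px_psi = apply_p(apply_x(psi))   # p acting on (x psi)
xp_psi = apply_x(apply_p(psi))   # x acting on (p psi)

points = [-1.0, -0.3, 0.0, 0.7, 1.2]
max_error = max(
    abs(px_psi(x) - xp_psi(x) - H_PLANCK / (2j * math.pi) * psi(x))
    for x in points
)
print(max_error)
```

The difference $(p_x x - x p_x)\psi$ agrees with $\frac{h}{2\pi i}\psi$ at every test point, up to the small error of the differencing.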

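We notice by analogy with what we have done that, taking the 
wave function of the form given above, 
$-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = E\psi$. This 
again is taken to be a general method of finding the energy of a 
wave function: 

$$\bar{\bar E} = -\int \bar\psi\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}\,dv.$$

If $\psi = \sum_m c_m e^{-\frac{2\pi i}{h}E_m t}\,u_m(x, y, z)$, we evidently have 

$$-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = \sum_m c_m E_m e^{-\frac{2\pi i}{h}E_m t}\,u_m(x, y, z).$$

Multiplying by $\bar\psi$, we have 

$$\sum_{n,m} \bar c_n c_m E_m e^{\frac{2\pi i}{h}(E_n - E_m)t}\,\bar u_n u_m.$$

Integrating over the coordinates, the nondiagonal terms drop 
out on account of orthogonality of the u's, and the rest reduces to 

$$\bar{\bar E} = \sum_n \bar c_n c_n E_n, \tag{13}$$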


a weighted mean of the energy of the various states. 

228. Schrodinger's Equation Including the Time. — We are 
now able to give a more general interpretation of Schrodinger's 
equation than was possible in Chap. XXIX. We start with the 
classical expression 

$$H(q_1 \ldots q_n, p_1 \ldots p_n) = E,$$

where H is the Hamiltonian function, E is the total energy, and 
the equation represents the conservation of energy. But now 
suppose we try to replace each side by the corresponding quantum 
theory expression, so that we shall be able to allow each 
side to act on $\psi$, and if we wish multiply by $\bar\psi$ and integrate to 
get averages. The first step is 

$$H\!\left(q_1 \ldots q_n, \frac{h}{2\pi i}\frac{\partial}{\partial q_1}, \ldots, \frac{h}{2\pi i}\frac{\partial}{\partial q_n}\right)\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}. \tag{14}$$

But this is just Schrodinger's equation, in the form involving the 
time (which we have not so far met). To show that it reduces 
to the form we have previously met, let us take the case of 
rectangular coordinates x, y, z. There 

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V,$$

so that the equation becomes 

$$-\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + V\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}.$$

In this, let us assume a solution $\psi = e^{-\frac{2\pi i}{h}Et}\,u(x, y, z)$, where E 
is a constant, to be identified at the proper time with the energy. 
Then the equation becomes 

$$\left(-\frac{h^2}{8\pi^2 m}\nabla^2 + V\right)u = Eu, \tag{15}$$

which leads immediately to the form of Schrodinger's equation 
with which we are familiar. 

229. Some Theorems Regarding Matrices. — Suppose that we 
have an operator F, formed from a function of the q's and p's 
by replacing the p's by differentiations, in the manner we have 
described. Then we have by definition $F_{nm} = \int \bar u_n F u_m\,dv$. But 
we can look at this in the following way. The $u_n$'s form a set of 
orthogonal unit vectors in function space. $Fu_m$ is a function 
different in general from any of the u's, and hence a different 
vector. The quantity $\int \bar u_n F u_m\,dv$ is the scalar product of $Fu_m$ 
with $u_n$; that is, it is the component of $Fu_m$ along the nth axis. 
But this suggests writing a vector equation: 

$$Fu_m = \sum_n F_{nm} u_n, \tag{16}$$

expressing $Fu_m$ as a sum of unit vectors, each multiplied by the 
corresponding component. To prove this, we need only multiply 
by $\bar u_n$ and integrate, when the right side, on account of 
orthogonality, leaves only $F_{nm}$. An example of such an expression is 
Schrodinger's equation not involving the time, which can be written 

$$Hu_n = E_n u_n, \tag{17}$$

if $E_n$ is the energy in the nth state. This obviously expresses 
the fact that the matrix of H has only diagonal components 
($H_{nm} = E_n$ if n = m, zero if $n \neq m$), so that, since H has no 
nondiagonal components, it has no terms depending on time, or 
is a constant. 

It is interesting to write down the matrix of a constant, for 
example a number C. Evidently $C_{nm} = \int \bar u_n C u_m\,dv = C$ if 
n = m, 0 if $n \neq m$. A particular case is the matrix of unity, 
$\int \bar u_n u_m\,dv = 1$ if n = m, 0 if $n \neq m$, simply the orthogonality and 
normalization conditions. This matrix is often called $\delta_{nm}$; by 
definition $\delta_{nm} = 1$ if n = m, 0 if $n \neq m$. In terms of this, we 
have $C_{nm} = C\delta_{nm}$. And we can write the matrix of the energy as 

$$H_{nm} = E_n \delta_{nm}. \tag{18}$$

This matrix equation, stating that the matrix of the energy is a 
diagonal matrix with the characteristic values $E_n$, may be taken 
as a matrix statement of Schrodinger's equation; we readily see 
that it is just what would be obtained by multiplying Schrodinger's 
equation by an arbitrary $\bar u_m$, and integrating. We shall 
actually use this matrix equation later in discussing perturbation 
theory. It is to be noted that a matrix depends on two things: 
first, the operator, and secondly, the set of orthogonal functions 
with respect to which it is computed. Thus a given operator, as 
energy or angular momentum or x coordinate, can have its 
matrix computed in any set of orthogonal functions. The prob- 
lem of solving Schrodinger's equation with a given energy opera- 
tor may be considered as that of finding the particular set of 
orthogonal functions which makes that operator diagonal. In 
a similar way we can find a set of orthogonal functions which 
would make any other desired operator have a diagonal matrix. 
We shall see in the next chapter that this involves us essentially 
in a rotation of axes in function space, similar to what we found 
in introducing normal coordinates in vibration problems. 

From our expansion of $Fu_m$ in series in the $u_n$'s, we can easily 
get the method of multiplying matrices, which is very useful in 
matrix manipulation. Suppose that we have two operators F 
and G, and know the matrix components $F_{nm}$ and $G_{nm}$. We can 
then find easily the matrix components of the product operator 
FG. For we have 

$$Gu_n = \sum_m G_{mn} u_m,$$

$$FGu_n = \sum_m G_{mn} Fu_m = \sum_{m,k} G_{mn} F_{km} u_k = \sum_k \Bigl(\sum_m F_{km} G_{mn}\Bigr) u_k.$$

But also 

$$(FG)u_n = \sum_k (FG)_{kn} u_k,$$

by the earlier formula. Hence 

$$(FG)_{kn} = \sum_m F_{km} G_{mn}, \tag{19}$$


the formula for multiplying matrices. 
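The multiplication rule (19) lends itself to a short numerical sketch. The example below uses the linear-oscillator matrix of x quoted in the problems at the end of this chapter, cut off at eight states, and forms the matrix of the energy $(m/2)(\dot x^2 + 4\pi^2\nu^2x^2)$. Two things are assumptions of the sketch rather than statements of the text: the matrix of $\dot x$ is taken to be $\frac{2\pi i}{h}(E_n - E_m)x_{nm}$, obtained by differentiating the time factor of each component, and the units are h = m = ν = 1. Cutting off the matrices spoils the last row and column, so only the remaining block is examined.

```python
import math

# Assumed units and truncation (not from the text).
H_PLANCK = 1.0
MASS = 1.0
NU = 1.0
SIZE = 8  # number of oscillator states kept


def matmul(F, G):
    """Matrix product by Eq. (19): (FG)_kn = sum over m of F_km G_mn."""
    size = len(F)
    return [[sum(F[k][m] * G[m][n] for m in range(size)) for n in range(size)]
            for k in range(size)]


def energy(n):
    return (n + 0.5) * H_PLANCK * NU


# Matrix of x: only components between neighboring states differ from zero.
x = [[0j] * SIZE for _ in range(SIZE)]
for n in range(SIZE - 1):
    value = math.sqrt((n + 1) * H_PLANCK / (8.0 * math.pi ** 2 * MASS * NU))
    x[n][n + 1] = x[n + 1][n] = complex(value)

# Assumed matrix of the velocity: time derivative of each oscillating component.
xdot = [[2j * math.pi / H_PLANCK * (energy(n) - energy(m)) * x[n][m]
         for m in range(SIZE)] for n in range(SIZE)]

x_squared = matmul(x, x)
xdot_squared = matmul(xdot, xdot)

# Matrix of the energy (m/2)(xdot^2 + 4 pi^2 nu^2 x^2).
hamiltonian = [[0.5 * MASS * (xdot_squared[n][m]
                              + 4.0 * math.pi ** 2 * NU ** 2 * x_squared[n][m])
                for m in range(SIZE)] for n in range(SIZE)]

for n in range(SIZE - 1):
    print(n, hamiltonian[n][n].real, energy(n))
```

Within the block unaffected by the cutoff, the energy matrix comes out diagonal, with diagonal components $(n + \tfrac12)h\nu$.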


It is a rather remarkable fact that the method of operating 
with matrices was discovered before the wave mechanics. This 
multiplication rule, and the commutation rule, were both devel- 
oped. They were used for a number of complicated calculations, 
without use of wave functions, for example for finding the energy 
levels of the linear oscillator, its intensities of radiation, and even 
the energy levels of the hydrogen atom. For a few problems, as 
perturbation theory, the matrix method is still more convenient 
than the wave method, as we shall see. 


1. Prove that a coordinate commutes with another coordinate; a momen- 
tum commutes with another momentum; and a coordinate commutes with 
a momentum conjugate to another coordinate. 

2. Write down the operators for the three components of angular momen- 
tum in rectangular coordinates. 

3. If F is any operator, prove that 
$\frac{h}{2\pi i}\frac{dF}{dt} = (HF - FH)$, where H is 
the Hamiltonian operator, the equation above to be regarded as either an 
operator or matrix equation. To prove it, take average values of the 
operators. Find the average value of F, differentiate it with respect to 
time, to get the left side of the expression. On the right, in computing 
the average values, use the multiplication rule to compute the matrices 
of HF and FH, noting that H has a diagonal matrix. Finally identify terms 
on both sides of the equation. 

4. Using the result of Prob. 3, prove that the time rate of change of the 
energy is zero; prove that H and t satisfy a commutation relation like a 
momentum and coordinate. 

5. Show that for the linear oscillator the assumptions 

$$E_n = (n + \tfrac{1}{2})h\nu$$

$$x_{nn} = 0$$

$$x_{n+1,n} = x_{n,n+1} = \sqrt{\frac{(n+1)h}{8\pi^2 m\nu}}$$

$$x_{nm} = 0 \text{ if } m \neq n \pm 1$$

satisfy the quantum mechanics. To do this, compute the matrix components 
$\dot x_{nm}$, and find the matrix of the energy expression 
$(m/2)(\dot x^2 + 4\pi^2\nu^2x^2)$, computing the matrices of $\dot x^2$ and 
$x^2$ by the multiplication rule. Show that 
this matrix is diagonal, its diagonal components being the energy values 
given above. 

6. By comparing with the wave functions of the linear oscillator in Chap. 
XXIV, Prob. 6, verify that the values of matrix components in Prob. 5 
are correct. If you cannot give a general proof, take the actual wave func- 
tions you have worked out, in Prob. 6, Chap. XXIX, using them for n = 0, 
1, 2, normalizing, and calculating the matrix components by direct 
integration. 


7. Show that a linear oscillator radiating from the nth stationary state 
cannot jump except to the (n − 1)st state, so that there is a selection 
principle on its radiation. Compute the rate of radiation of the oscillator 
in the nth state, on the assumption that it is the same as that of a classical 
oscillator whose charge is e, displacement is 
$x_{n,n-1}e^{\frac{2\pi i}{h}(E_n - E_{n-1})t} + x_{n-1,n}e^{\frac{2\pi i}{h}(E_{n-1} - E_n)t}$. 
Compare this displacement with the displacement of 
a classical oscillator of energy $E_n$, showing that in the limit of large 
quantum numbers both amplitude and frequency of the classical oscillator agree 
with the quantum values. This is an example of the correspondence 
principle. 

8. Solve Schrodinger's equation for a rotator, whose kinetic energy is 
$\tfrac{1}{2}I\dot\theta^2$, in the absence of an external force. Find wave 
functions, showing that the angular momentum is an integral multiple of 
$h/2\pi$. Compute the matrix of $R\cos\theta$, one component of displacement 
of a point attached to the rotator at a distance R from the axis. Show that 
all matrix components are zero except those in which the angular momentum 
changes by ±1 unit. 

9. Find what $p^2q - qp^2$ is equal to, using the commutation rule for 
$pq - qp$. 

10. Show that $e^{\frac{2\pi i}{h}ap}\,u(x)$, where p is the x component of 
momentum, a is a constant, is equal to $u(x + a)$. Use Taylor's expansion of 
the exponential. 

11. Write down Schrodinger's equation in spherical polar coordinates, 
by using the Laplacian in these coordinates, assuming a potential V(r). 
Discuss the method of deriving the equation from the Hamiltonian by 
replacing the momenta by differentiations, showing that the former method 
is consistent with the latter, but that the latter method does not lead to 
unique results. 


There are many problems in wave mechanics which, though 
they cannot be exactly solved, are approximated by soluble 
problems. Thus a nonlinear oscillator can be approximated 
by a linear one; or a system, as an atom, in an external electric 
or magnetic field can be approximated by the same system 
without the field. The perturbation theory is adapted to the 
solution of such problems, starting with the known approximate 
solution, and expanding in power series in the perturbation. 
At the same time, there are some problems of more general 
nature treated by perturbation theory. Thus the radiation 
of an atom can be examined by treating the interaction of the 
atom and a radiation field as a perturbation. We shall be led 
by such questions to a discussion of the transitions between 
stationary states. The actual method we shall use is closely 
analogous to the perturbation theory used with the nonuniform 
vibrating string. 

230. The Secular Equation of Perturbation Theory. — Suppose 
that we wish to solve Schrodinger's equation $Hu_n = E_nu_n$, 
where H is the given Hamiltonian. Let us start with a set of 
orthogonal functions $u_n^0$, which often are solutions of a similar 
problem approximating the real one, and let us expand the 
correct functions $u_n$ in series in the $u_n^0$'s: 

$$u_n = \sum_m S_{mn} u_m^0. \tag{1}$$


Then the problem may be regarded as that of finding the expan- 
sion coefficients S mn , which are really coefficients of a linear 
transformation in function space transforming from the original 
set of orthogonal functions to the final, correct, ones, so that we 
may expect the S's to satisfy orthogonality and normalization 
conditions. We substitute this expression for $u_n$ in Schrodinger's 
equation, and get the condition for the coefficients. If we 
substitute, multiply by $\bar u_k^0$, and integrate, we shall have only 
one term on the right, on account of orthogonality of the $u^0$'s; 
on the left, we shall have a linear combination, each term involving 
a matrix component of H with respect to the $u^0$'s, for example 
$H_{km} = \int \bar u_k^0 H u_m^0\,dv$. We recall that, since the $u^0$'s are not 
solutions of the problem, this matrix will not be diagonal. 
Carrying out the substitution, we have 

$$\sum_m (H_{km} - E_n\delta_{km}) S_{mn} = 0, \tag{2}$$

or an infinite set of equations for the infinite set of $S_{mn}$'s. Writing 
them for the nth stationary state, we have 

$$\begin{aligned}
(H_{11} - E_n)S_{1n} + H_{12}S_{2n} + H_{13}S_{3n} + \cdots &= 0 \quad (k = 1) \\
H_{21}S_{1n} + (H_{22} - E_n)S_{2n} + H_{23}S_{3n} + \cdots &= 0 \quad (k = 2) \\
\cdots
\end{aligned} \tag{3}$$

These equations are all homogeneous, of the same sort found 
whenever we have introduced normal coordinates or rotated 
axes, as, for example, in discussing coupled systems or the vibrating 
string. As usual, the equations in general do not have a 
solution; they have one only if the determinant of coefficients 

$$\begin{vmatrix}
H_{11} - E_n & H_{12} & H_{13} & \cdots \\
H_{21} & H_{22} - E_n & H_{23} & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{vmatrix} \tag{4}$$

is zero. This secular equation determines the energy levels. 

231. The Power Series Solution. — If the u°'s were solutions 
of the problem, H would have a diagonal matrix, the diagonal 
terms being the energy levels. Though this is not true, let us 
assume that the $u^0$'s are not far from solutions. Then by arguments 
of continuity the nondiagonal terms of H, though not 
zero, are small, and the diagonal terms, though not exactly 
the energy values $E_n$ of the exact solution, are not far from the 
correct values. Thus $E_n$ is approximately $H_{nn}$. We assume 
the problem is nondegenerate, by which we mean that only one 
state has even approximately this same value. Now let us 
recall how to expand a determinant. We take products of 
terms, choosing just one from each row, one from each column. 
There are N! ways of doing this, if the determinant has N rows 
and columns. We give each a sign + or − according to its 
requiring an even or an odd number of interchanges of rows 
or columns to bring the desired term to the principal diagonal. 
Finally we add. In this case, since we are dealing with small 
quantities, we look first for the largest product. This is plainly 
the principal diagonal, for the only large terms are those on the 
principal diagonal. For a first approximation we may set this 
equal to zero. It is already factored: 
$(H_{11} - E_n)(H_{22} - E_n)\cdots = 0$. One of the factors must be, to this 
approximation, zero. Plainly it must be $H_{nn} - E_n$, since this is the only 
term which is even small, assuming the system is nondegenerate. 
This then is the first order approximation to the energy: 
$E_n = H_{nn}$, the diagonal component of the matrix of the energy with 
respect to the approximate wave function. 

Using the first-order approximation to the energy, we can 
easily get the corresponding linear transformation and wave 
functions. If the $u^0$'s were the correct wave functions, we should 
have $S_{nn} = 1$, all the other S's = 0. To a first approximation, 
in the actual case, we may set $S_{nn} = 1$, but regard the other 
S's as small quantities. Then we have, for example, for the 
first equation 

$$(H_{11} - H_{nn})S_{1n} + \cdots + H_{1n} + \cdots = 0,$$

where the terms we do not write are of a smaller order than 
those we write. Hence 

$$S_{1n} = -\frac{H_{1n}}{H_{11} - H_{nn}}. \tag{5}$$

The other equations are of the same form, so that the approximate 
wave function is 

$$u_n = u_n^0 - \sum_{k \neq n} \frac{H_{kn}\,u_k^0}{H_{kk} - H_{nn}}. \tag{6}$$

For the second approximation to the energy, we must consider 
further terms in the determinant. We can proceed by analogy 
with the case of a determinant of two rows and two columns, 
which we should have if there were only two stationary states 
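to consider. In this case the secular equation would be 

$$\begin{vmatrix} H_{11} - E_n & H_{12} \\ H_{21} & H_{22} - E_n \end{vmatrix} = (H_{11} - E_n)(H_{22} - E_n) - H_{12}H_{21} = 0. \tag{7}$$

This is only a quadratic for $E_n$, and can be immediately solved: 

$$E_n = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} + H_{22}}{2}\right)^2 - (H_{11}H_{22} - H_{12}H_{21})} = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} - H_{22}}{2}\right)^2 + H_{12}H_{21}}. \tag{8}$$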


This explicit formula is analogous to the formula for the frequency 
of a system of two coupled oscillators, obtained in Eq. 4, 
Chap. XI. Here as there, if the nondiagonal matrix component 
$H_{12}$ of the energy is small, we can expand the radical by the 
binomial theorem, obtaining without trouble the two solutions 
as power series in $H_{12}$, 

$$E_1 = H_{11} + \frac{H_{12}H_{21}}{H_{11} - H_{22}} + \cdots$$

$$E_2 = H_{22} + \frac{H_{12}H_{21}}{H_{22} - H_{11}} + \cdots \tag{9}$$

analogous to Eq. (5) of Chap. XI. Here, as there, the effect 
of the second-order perturbation terms is to push apart the two 
levels. Thus the first-order calculation alone gives $E_1 = H_{11}$. 
The numerator of the fraction giving the second-order calculation, 
$H_{12}H_{21}$, is really a perfect square, for it can be shown that 
$H_{21} = \bar H_{12}$ (similar theorems hold in general for all the matrices of 
real quantities which we meet). Thus the numerator is positive. 
If $H_{11}$ is greater than $H_{22}$, so that the first-order level 1 lies above 
2, the denominator is also positive, so that the level is still 
further raised by this perturbation. On the other hand, for the 
other level, the denominator is negative, and the level is further 
depressed. 
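The pushing apart of the two levels, and the accuracy of the power series, can be seen in a small numerical sketch. The matrix components below are invented for the illustration (with $H_{12} = H_{21}$ real), and the function names are hypothetical. The exact roots are taken from the quadratic formula for the two-row secular equation, and the approximate ones from the second-order expressions.

```python
import math

# Hypothetical matrix components of the energy for two states.
H11, H22 = 2.0, 1.0
H12 = H21 = 0.1


def exact_levels(h11, h22, h12, h21):
    """Roots of the two-row secular equation (upper root first)."""
    mean = 0.5 * (h11 + h22)
    root = math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 * h21)
    return mean + root, mean - root


def second_order_levels(h11, h22, h12, h21):
    """Power-series values of E_1 and E_2 through the second order."""
    e1 = h11 + h12 * h21 / (h11 - h22)
    e2 = h22 + h12 * h21 / (h22 - h11)
    return e1, e2


exact = exact_levels(H11, H22, H12, H21)
approx = second_order_levels(H11, H22, H12, H21)

# Since H11 > H22 here, the upper exact root belongs to level 1.
print("level 1:", exact[0], approx[0])
print("level 2:", exact[1], approx[1])
```

With these numbers the upper level is raised and the lower one depressed by about 0.01, and the series and the exact roots differ only in the fourth decimal place.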

The exact solution which we have obtained in Eq. (8) is only 
possible when the secular equation is simple enough to handle 
algebraically. The approximations (9), however, can be found 
directly from the secular equation (7). Thus let us consider 
$E_1$. We assume that the equation is not degenerate, so that 
$H_{22} - E_1$ is not a small quantity, and we may divide Eq. (7) 
by it. Thus we have 

$$H_{11} - E_1 = \frac{H_{12}H_{21}}{H_{22} - E_1}.$$

Replacing the $E_1$ in the denominator by its value $H_{11}$, which is 
correct to the first order, this becomes 

$$E_1 = H_{11} + \frac{H_{12}H_{21}}{H_{11} - H_{22}},$$

agreeing with Eq. (9). By a little consideration of the determinant, 
exactly a similar discussion can be given in the general 
case. And the result proves to be simply 

$$E_n = H_{nn} - \sum_{k \neq n} \frac{H_{nk}H_{kn}}{H_{kk} - H_{nn}}, \tag{10}$$


in agreement with the special case solved above. It is very 
rarely that further approximations than we have given are used, 
for either the energy or the wave function. 
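A sketch with three states shows how the general second-order formula compares with the roots of the secular determinant. The matrix below is invented for the example, with small real nondiagonal components and well separated diagonal ones. The roots are located by bisection, which is merely a convenient choice made here and not a method of the text.

```python
# Hypothetical energy matrix: small nondiagonal terms, no degeneracy.
H = [[1.00, 0.05, 0.02],
     [0.05, 2.00, 0.04],
     [0.02, 0.04, 3.50]]


def determinant3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def secular(e):
    """Secular determinant with E subtracted along the principal diagonal."""
    shifted = [[H[i][j] - (e if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return determinant3(shifted)


def bisect_root(func, lo, hi, iterations=60):
    """Root of func between lo and hi, assuming one sign change there."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def second_order(n):
    """Second-order energy: H_nn minus the sum of H_nk H_kn / (H_kk - H_nn)."""
    return H[n][n] - sum(H[n][k] * H[k][n] / (H[k][k] - H[n][n])
                         for k in range(3) if k != n)


# Each root is sought in a small interval about the first-order value H_nn.
exact = [bisect_root(secular, H[n][n] - 0.3, H[n][n] + 0.3) for n in range(3)]
approx = [second_order(n) for n in range(3)]

for n in range(3):
    print(n + 1, exact[n], approx[n])
```

For this matrix the second-order values and the roots of the determinant agree to better than one part in a thousand, the remainder being of the third order in the nondiagonal components.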

232. Perturbation Theory for Degenerate Systems. — We shall 
often meet cases in which the unperturbed problem is degenerate; 
that is, where the diagonal energies $H_{nn}$ of several states are
almost exactly equal to each other. In this case, the power 
series method evidently does not work; the differences of energy 
which appear in the denominator of the terms in Eq. (9) or (10) 
become zero, or very small, and the series diverge and even have 
infinite terms. If there were only two levels, as in the special 
case taken up in the last section, we could solve the problem 
explicitly, not using the power series at all. Thus if
$H_{11} - H_{22} = 0$, Eq. (8) gives

$$E = H_{11} \pm H_{12}, \qquad (11)$$

an important formula for perturbations of degenerate systems. 
With a finite number of degenerate levels, we have a secular 
equation of finite degree, and while we cannot solve it as con- 
veniently as the quadratic, still we can approximate its solutions, 
even for the degenerate case where the differences of diagonal 
energies are smaller than the nondiagonal energy terms. Now 
it fortunately happens that in many problems in which degen- 
eracy enters, as in atomic spectra, the levels fall into groups, 
the energies of all the levels in a group being about the same, 
but the different groups being well separated in energy. Such 
groups of levels are the multiplets in atomic spectra. In these 
cases we first solve the problem of the levels within a group, 
finding an exact solution for the finite secular equation. This 
solution gives us not only energy levels, but also coefficients 
of linear combinations transforming the original wave functions 
of the group into a new set which has the property that it makes 
the matrix components of the energy diagonal, with respect 
to the states of this group. We then use these transformed 
functions as the starting point of a new perturbation calculation, 
in which perturbations between adjacent groups are considered. 
In terms of these transformed functions, the energy will have no 
nondiagonal components between levels which lie close to each 
other, in the groups, but only between levels in different groups, 
at a considerable energy distance apart. Thus we may use 
the series method of Eq. (10), and the second-order terms will 


be small, since the only terms of the summation for which the 
denominator is small will have numerators equal to zero. It 
is to be particularly noted in this discussion that the difficulty 
in applying the power series method to degenerate systems arises, 
not on account of any unusual size of the nondiagonal energy 
components, but on account of the unusually small energy differ- 
ences between diagonal terms. The method converges only 
if the nondiagonal component between any two levels is small 
compared with the difference of diagonal energies of the two 
terms. This demands that before applying the power series 
method the nondiagonal terms between degenerate levels be 
removed, but it imposes no such requirement on the terms 
between levels of quite different energy. 
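
The two-stage procedure just described can be sketched numerically. The Python example below is only an illustration under stated assumptions: the three-level energy matrix is invented, levels 1 and 2 are taken exactly degenerate, and all names are hypothetical. It first solves the degenerate pair exactly with Eq. (11), then transforms the nondiagonal components to the new combinations, and finally applies the series of Eq. (10), checking the result by substituting into the secular determinant.

```python
import math

# Hypothetical energy matrix: levels 1 and 2 are degenerate, level 3 lies far away.
H = [[1.00, 0.20, 0.05],
     [0.20, 1.00, 0.03],
     [0.05, 0.03, 3.00]]

def secular_det(h, e):
    """Determinant of (h - e*I) for a 3x3 matrix; it vanishes at an exact level."""
    a = [[h[i][j] - (e if i == j else 0.0) for j in range(3)] for i in range(3)]
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

# Step 1: exact solution inside the degenerate group, Eq. (11).
e_plus = H[0][0] + H[0][1]
e_minus = H[0][0] - H[0][1]

# Step 2: components between the combinations (u1 +/- u2)/sqrt(2) and level 3.
h_plus_3 = (H[0][2] + H[1][2]) / math.sqrt(2.0)
h_minus_3 = (H[0][2] - H[1][2]) / math.sqrt(2.0)

# Step 3: series of Eq. (10); only well-separated levels now appear in denominators.
E_plus = e_plus - h_plus_3 ** 2 / (H[2][2] - e_plus)
E_minus = e_minus - h_minus_3 ** 2 / (H[2][2] - e_minus)
E_3 = (H[2][2]
       - h_plus_3 ** 2 / (e_plus - H[2][2])
       - h_minus_3 ** 2 / (e_minus - H[2][2]))

# Residuals of the secular equation at the approximate levels (small if accurate).
residuals = [secular_det(H, e) for e in (E_plus, E_minus, E_3)]
print(E_plus, E_minus, E_3)
print(residuals)
```

Applying Eq. (10) directly to the original matrix would have put the vanishing difference of the first two diagonal terms in a denominator; after the transformation inside the group that term has a zero numerator and is simply omitted.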

We can see more clearly what is happening from a mechanical 
analogy. Suppose we have a large number of mechanical 
oscillators coupled together, all having different natural fre- 
quencies, except the first two, which have unperturbed fre- 
quencies exactly, or almost exactly, the same. In considering 
the interaction, the effect of the two of equal frequency on each 
other will be large, since each one resonates with the other, 
but the others will have much less effect. We, therefore, first 
solve only the interaction of these two resonating oscillators, 
introducing normal coordinates for them. Then we can proceed 
with the discussion of the interaction, treating the effect of the 
other oscillators, not on these two oscillators individually, but 
on the two normal coordinates representing them. Of course, 
if there are several groups of degenerate levels, we introduce 
changes of variables inside each group first, then apply the 
ordinary perturbation theory. We shall have many examples 
of degenerate systems in our discussion of atomic structure, 
where nearly every energy level of an unperturbed atom is 
degenerate, and is split up by an external perturbing field, 
as an electric or magnetic field. In more complicated atoms, the 
perturbing fields come from within the atom itself, being inter- 
actions of one part on another, producing the multiplet structure. 
In actual practice, we shall find the study of degenerate systems 
very important. 

233. The Method of Variation of Constants. — A slightly 
different point of view in perturbations is obtained by considering
the time variation. Let us expand $\psi$, the correct wave
function depending on time, in series in the unperturbed functions
$u^0$: $\psi = \sum_m C_m(t) u_m^0(x)$, where the $C$'s, functions of time,
would be pure exponentials, $c_n e^{-\frac{2\pi i}{h}E_n t}$, if the $u^0$'s
were the correct solutions of the problem. Whether correct or not, we
can always make the expansion above, for at any instant $\psi$ can be
expressed in series in the orthogonal functions $u^0$, the coefficients
being functions of time. Now let us try to satisfy the equation
$H\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}$. We have

$$\sum_m C_m H u_m^0 = -\frac{h}{2\pi i}\sum_m \frac{dC_m}{dt} u_m^0.$$

Multiplying by $\bar{u}_k^0$ and integrating, the result is

$$\frac{dC_k}{dt} = -\frac{2\pi i}{h}\sum_m H_{km} C_m. \qquad (12)$$

These equations for the time derivatives of the $C$'s in terms of
their instantaneous values are enough to determine the complete
solution of the problem.

To make connection with the ordinary method, we need only
assume $C_m = S_{mn} e^{-\frac{2\pi i}{h}E_n t}$, an exponential solution.
Then immediately we have, canceling the exponential, and the factor
$-2\pi i/h$,

$$E_n S_{kn} = \sum_m H_{km} S_{mn},$$

or exactly the equation we have previously used. In more 
general cases, however, it is not always possible to make this 
assumption. An example is that in which the perturbative 
force depends on the time. 
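
The equations of motion (12) for the coefficients can be integrated step by step. The sketch below is a minimal Python illustration, not part of the original treatment: it assumes units in which h/2π = 1, a made-up two-level energy matrix with equal diagonal terms and a nondiagonal term V, and a standard fourth-order Runge-Kutta integrator. For this case the coefficients should exchange completely after a time π/(2V), while the sum of the squared magnitudes stays equal to one.

```python
import math

def derivative(H, C):
    """dC_k/dt = -i * sum_m H_km C_m, Eq. (12) in units where h/(2*pi) = 1."""
    n = len(C)
    return [-1j * sum(H[k][m] * C[m] for m in range(n)) for k in range(n)]

def rk4_step(H, C, dt):
    """One fourth-order Runge-Kutta step for the coefficient vector C."""
    k1 = derivative(H, C)
    k2 = derivative(H, [c + 0.5 * dt * k for c, k in zip(C, k1)])
    k3 = derivative(H, [c + 0.5 * dt * k for c, k in zip(C, k2)])
    k4 = derivative(H, [c + dt * k for c, k in zip(C, k3)])
    return [c + dt / 6.0 * (a + 2 * b + 2 * d + e)
            for c, a, b, d, e in zip(C, k1, k2, k3, k4)]

def evolve(H, C0, t_end, steps):
    """Integrate Eq. (12) from t = 0 to t_end in equal steps."""
    C = list(C0)
    dt = t_end / steps
    for _ in range(steps):
        C = rk4_step(H, C, dt)
    return C

# Hypothetical degenerate pair: equal diagonal energies, nondiagonal term V.
V = 0.1
H = [[1.0, V], [V, 1.0]]
T_SWAP = math.pi / (2.0 * V)          # time for complete transfer 1 -> 2
C = evolve(H, [1.0 + 0j, 0j], T_SWAP, 4000)
p1, p2 = abs(C[0]) ** 2, abs(C[1]) ** 2
print(p1, p2, p1 + p2)
```

The same routine accepts a larger matrix without change, since only the energy matrix H and the starting coefficients enter.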

234. External Radiation Field. — The most interesting example 
of the method of variation of constants is the perturbation by an 
external radiation field, for this actually produces transitions 
between stationary states. First let us look a moment at the 
physical side of the problem, so as to understand what we expect 
to obtain from the calculations. An ordinary radiation field is 
never exactly sinusoidal; its amplitude, at a given point of space, 
as function of time, may be analyzed in Fourier series of very 
long period, as in Sec. 185, Chap. XXV. If the field is approximately
monochromatic of frequency $\nu$, that means that only
frequencies in the neighborhood of $\nu$ will have large amplitudes
in the Fourier representation. On the other hand, if it is continuous
radiation, as the radiation from hot solids, there will be
considerable amplitude in all frequencies, at least over a certain
region. We assume the latter case. The electric field in the $x$
direction at a given point will then be
$\sum_\nu E_\nu \cos 2\pi(\nu t - \alpha_\nu)$,
where $E_\nu$, $\alpha_\nu$ are amplitude and phase of the component of
frequency $\nu$, and where we have components of frequencies differing
by small increments $d\nu = 1/T$, where $T$ is the fundamental
period. The phases $\alpha_\nu$ of successive components may be treated
as being statistically independent of each other; that is, if we
take any two components, the chance that the phase angle
between them at any instant should have one value is just equal
to the chance that it have another value. The values of $E_\nu$ will
be treated as functions of $\nu$, though a somewhat more general
treatment subjects them to probability laws too. Now we are
interested in finding $\rho_\nu\,d\nu$, the energy per unit volume in the
frequency range $d\nu$. Since one component of the series is associated
with the range $d\nu = 1/T$, we can simply find the energy of
this component. For the $x$ component of electric field, this is
$\frac{1}{8\pi}[2E_\nu^2 \cos^2 2\pi(\nu t - \alpha_\nu)]$, the factor 2
taking account of the magnetic field as well as the electric field.
The time average of this term is $E_\nu^2/(8\pi)$. If we are dealing
with radiation having equal intensities in all directions, the mean
energy per unit volume associated with $x$, $y$, and $z$ coordinates
will be equal. Hence we have

$$\rho_\nu\,d\nu = \frac{3}{8\pi}E_\nu^2. \qquad (13)$$

235. Einstein's Probability Coefficients. — Now suppose a 
radiation field of the type we have described is allowed to act on 
an atomic system. Einstein was the first to solve this problem. 
He assumed that, if the atom is in its $m$th state, there will be the
following probabilities of transition to other states, induced by
the radiation field:

1. A probability $A_{mn}$ of radiating spontaneously to each state
$n$ which is of lower energy than the $m$th, with emission of the
corresponding photon of frequency $\nu_{mn}$, given by
$E_m - E_n = h\nu_{mn}$. This spontaneous emission corresponds to the
ordinary radiation of an oscillating dipole in classical
electromagnetic theory.

2. A probability $B_{mn}\rho_{mn}$ of absorbing a photon of frequency
$\nu_{mn}$ from the radiation field, where now the state $n$ has higher
energy than $m$, and of jumping up to the state $n$. This probability
is proportional to the energy density $\rho_{mn}$ at the particular
frequency $\nu_{mn}$ in the external radiation field.

3. A probability $B_{mn}\rho_{mn}$, where now the $n$th state lies below
the $m$th, of emitting a photon of frequency $\nu_{mn}$, and falling to
the lower state, under action of the radiation. This is called induced
or forced emission.

Einstein assumed that the following relations held between
the $A$'s and $B$'s corresponding to any transition between states $n$
and $m$, where $E_m > E_n$: $B_{mn} = B_{nm}$, and
$A_{mn}/B_{mn} = 8\pi h\nu_{mn}^3/c^3$. Assuming
these simple laws, he could then give a very elementary deriva- 
tion of Planck's law of black-body radiation. Let us assume that 
we have a piece of matter containing many kinds of atoms, so 
as to have some capable of emitting and absorbing each fre- 
quency. Consider a particular set of atoms having a lower 
state 1, an upper state 2, and assume that at temperature T the 
number of atoms in the upper state is to the number in the lower 
state as $e^{-E_2/kT}$ is to $e^{-E_1/kT}$, or the Maxwell-Boltzmann
distribution law. Now we ask, what intensity, or energy density, in
the external radiation field must we have to be in equilibrium
with these atoms? If we can find this for each frequency of
radiation, we shall necessarily have the distribution of intensity
in radiation in equilibrium with matter at temperature $T$, which
is what Planck's law gives. Let $N_2$ be the number of atoms in
the second state, $N_1$ in the first, so that
$N_2/N_1 = e^{-(E_2 - E_1)/kT} = e^{-h\nu/kT}$, where $\nu$ is the
frequency emitted or absorbed by the atom in its transition. Now we
know that the number of atoms leaving the second state per second is
equal to the sum of the following:

1. The number leaving on account of spontaneous radiation,
or $N_2 A_{21}$.

2. The number entering on account of absorption from the
lower state, or $-N_1 B_{12}\rho_{12}$.

3. The number leaving on account of induced emission, or
$N_2 B_{21}\rho_{12}$.

This sum must be zero, in a steady state where the
$N$'s are constant. Hence

$$N_2(A_{21} + B_{21}\rho_{12}) = N_1 B_{12}\rho_{12}.$$

Using the relation between the $A$'s and $B$'s, this is

$$N_2 B_{12}\left(\rho_{12} + \frac{8\pi h\nu^3}{c^3}\right) = N_1 B_{12}\rho_{12}.$$

Setting $N_2/N_1 = e^{-h\nu/kT}$, canceling $B_{12}$, and solving for
$\rho_{12}$, we have

$$\rho_{12} = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}, \qquad (14)$$


which is Planck's law of black-body radiation. 
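
The balance argument above can be checked numerically. The following Python sketch is an illustration only: the constants are rounded CGS values, the frequency and temperature are arbitrary choices, and the function names are hypothetical. It solves the steady-state condition for the energy density using Boltzmann populations and Einstein's relations between the A's and B's, and compares the result with Planck's formula.

```python
import math

H_PLANCK = 6.626e-27   # erg s (rounded)
C_LIGHT = 2.998e10     # cm / s (rounded)
K_BOLTZ = 1.381e-16    # erg / K (rounded)

def planck_density(nu, temp):
    """Planck's law for the energy density per unit frequency range."""
    prefactor = 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3
    return prefactor / math.expm1(H_PLANCK * nu / (K_BOLTZ * temp))

def equilibrium_density(nu, temp, b12=1.0):
    """Density for which N2*(A21 + B21*rho) = N1*B12*rho, with B21 = B12."""
    a21 = b12 * 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3
    ratio = math.exp(-H_PLANCK * nu / (K_BOLTZ * temp))   # N2 / N1
    # Rearranged balance: N2 * A21 = (N1 - N2) * B12 * rho.
    return ratio * a21 / (b12 * (1.0 - ratio))

NU, TEMP = 5.0e14, 5000.0    # assumed frequency (1/s) and temperature (K)
rho_balance = equilibrium_density(NU, TEMP)
rho_planck = planck_density(NU, TEMP)
print(rho_balance, rho_planck)
```

The value chosen for B12 drops out of the result, as it does in the algebra of the text.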

236. Method of Deriving the Probability Coefficients.— Ein- 
stein's coefficient A is often derived by analogy with classical 
theory as follows: In Chap. XXXI we have seen that the matrix 
components of electric moment are connected with probabilities 
of radiation. Thus, if the amplitude of the component of
moment of the atom corresponding to the transition $2 \to 1$ is $C$,
the corresponding classical rate of radiation is
$\frac{16\pi^4\nu^4 C^2}{3c^3}$. We can write this component in terms of
the matrices as follows: corresponding to this frequency, we have the
terms $(ex)_{12}e^{\frac{2\pi i}{h}(E_1 - E_2)t} +
(ex)_{21}e^{\frac{2\pi i}{h}(E_2 - E_1)t} = 2(ex)_{12}\cos 2\pi\nu t$,
where $h\nu = E_2 - E_1$. Thus $C = 2(ex)_{12}$, and the rate of
radiation is $\frac{64\pi^4\nu^4 (ex)_{12}^2}{3c^3}$. But an
atom with a probability $A_{21}$ of radiating a photon of energy $h\nu$
is radiating on the average at the rate of $A_{21}h\nu$ per second.
Hence we must set this equal to the rate of radiation above, giving

$$A_{21} = \frac{64\pi^4\nu^3 (ex)_{12}^2}{3hc^3}, \qquad B_{21} = A_{21}\frac{c^3}{8\pi h\nu^3} = \frac{8\pi^3 (ex)_{12}^2}{3h^2}. \qquad (15)$$

The argument given above is hardly a derivation; it is merely 
suggestive. To get a real derivation of the probabilities, we use 
the method of perturbations. We shall find, for a reason to be 
discussed in a later section, that we can only obtain the $B$'s by
this method. We shall assume that at $t = 0$ the atom is definitely
in the $m$th state; that is, $c_m^0 = 1$, all other $c^0$'s are zero,
where the $c$'s are the coefficients in the expansion of the wave
function $\psi$ in terms of the unperturbed stationary states, so that
$\bar{c}_n c_n$ is the probability of finding the system in the $n$th
state, and the $c^0$'s are the values when $t = 0$. Then we shall
investigate the time variation of the $c$'s by the method of variation
of constants, and it will appear that the $\bar{c}c$'s for $n$ different
from $m$ increase linearly with time, so long as we consider only small
intervals of time and small perturbations, the term $\bar{c}_m c_m$
correspondingly decreasing. This we interpret as a definite probability
that the system will leave the $m$th state and go to the $n$th; in fact
we shall find $\bar{c}_n c_n$ equal exactly to $B_{mn}\rho_{mn}t$, as far
as the variation is linear with time. By comparing this expression with
the derived values of $\bar{c}_n c_n$, we can evaluate the $B$'s directly
from perturbation theory.
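
To see how the two expressions of Eq. (15) hang together, one can evaluate them for an assumed dipole matrix component. The Python sketch below uses rounded CGS constants and purely hypothetical values for the frequency and for $(ex)_{12}$; it is meant only to confirm that the ratio of the coefficients reproduces Einstein's relation and that $A_{21}h\nu$ equals the classical rate of radiation.

```python
import math

H_PLANCK = 6.626e-27   # erg s (rounded)
C_LIGHT = 2.998e10     # cm / s (rounded)

def einstein_coefficients(nu, dipole):
    """Return (A21, B21) of Eq. (15); dipole is the component (ex)_12 in esu cm."""
    a21 = 64.0 * math.pi ** 4 * nu ** 3 * dipole ** 2 / (3.0 * H_PLANCK * C_LIGHT ** 3)
    b21 = 8.0 * math.pi ** 3 * dipole ** 2 / (3.0 * H_PLANCK ** 2)
    return a21, b21

NU = 5.0e14        # assumed transition frequency, 1/s
DIPOLE = 1.0e-18   # assumed value of (ex)_12, esu cm

A21, B21 = einstein_coefficients(NU, DIPOLE)
# Classical rate of radiation of a dipole of amplitude C = 2*(ex)_12.
classical_rate = 16.0 * math.pi ** 4 * NU ** 4 * (2.0 * DIPOLE) ** 2 / (3.0 * C_LIGHT ** 3)
einstein_ratio = 8.0 * math.pi * H_PLANCK * NU ** 3 / C_LIGHT ** 3

print("A21 =", A21, "per second")
print("B21 =", B21)
```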

237. Application of Perturbation Theory. — Let the Hamiltonian 
of the system without radiation be $H^0$, and assume that the
unperturbed problem can be solved exactly:

$$H^0 u_n^0 = H_{nn}^0 u_n^0.$$

Let the perturbed Hamiltonian be
$H^0 - ex\sum_\nu E_\nu \cos 2\pi(\nu t - \alpha_\nu)$,
the second term representing the potential of the force of the
field represented by the summation, on the charge $e$. Under
the action of the perturbation, let the perturbed wave function
be $\psi = \sum_m C_m(t)u_m^0(x)$. Our task now is to find the $C$'s.
Using the method of variation of constants, noting that $H^0$ has a
diagonal matrix, we have

$$\frac{dC_n}{dt} = -\frac{2\pi i}{h}\sum_k H_{nk}C_k = -\frac{2\pi i}{h}H_{nn}^0 C_n + \frac{2\pi i}{h}\sum_k (ex)_{nk}C_k \sum_\nu E_\nu \cos 2\pi(\nu t - \alpha_\nu).$$

Now let $C_n = c_n(t)e^{-\frac{2\pi i}{h}H_{nn}^0 t}$, where $c_n(t)$ would
be constant in the absence of an external field. Writing the field in
exponential form, and letting $H_{nn}^0 - H_{kk}^0 = h\nu_{nk}$, this gives

$$\frac{dc_n}{dt} = \frac{\pi i}{h}\sum_k (ex)_{nk}c_k \sum_\nu E_\nu \left\{e^{2\pi i[(\nu + \nu_{nk})t - \alpha_\nu]} + e^{-2\pi i[(\nu - \nu_{nk})t - \alpha_\nu]}\right\}.$$

If the external field were not present, we plainly would have
$dc_n/dt = 0$; if there is a small field, the time derivative will be
small, or, in other words, the $c$'s will be approximately constant.
To a first approximation we may assume on the right side that
the $c$'s are exactly constant, having the values $c^0$ which they had
at $t = 0$. If this is so, we may integrate directly, obtaining

$$c_n - c_n^0 = \frac{1}{2h}\sum_k (ex)_{nk}c_k^0 \sum_\nu E_\nu \left\{e^{-2\pi i\alpha_\nu}\,\frac{e^{2\pi i(\nu + \nu_{nk})t} - 1}{\nu + \nu_{nk}} - e^{2\pi i\alpha_\nu}\,\frac{e^{-2\pi i(\nu - \nu_{nk})t} - 1}{\nu - \nu_{nk}}\right\}. \qquad (16)$$
Now let us take the case we have discussed, where at $t = 0$
we have $c_m^0 = 1$, all the other $c^0$'s zero. Then for any $n \neq m$,
we have only the single term of the summation above for which
$k = m$. Next we find $\bar{c}_n c_n$. In this, we have a product of two
sums over $\nu$, which is, therefore, a double sum. Each such
term for which we have different frequencies in the two factors
has a factor $e^{-2\pi i(\alpha_\nu - \alpha_{\nu'})}$, which, on account of
the random nature of the phases, is as likely to be positive as
negative, and on the average cancels. Thus we are left with only the
squares of the individual terms, in which the $\alpha$'s drop out.
Further, each of these squares has terms whose denominators are
respectively $(\nu + \nu_{nm})^2$, $(\nu + \nu_{nm})(\nu - \nu_{nm})$, and
$(\nu - \nu_{nm})^2$. The frequency $\nu_{nm}$ is so defined that it is
positive if the $n$th state lies above the $m$th, which we assume to be
the case for the moment. When $\nu$ becomes nearly equal to $\nu_{nm}$,
the term $(\nu - \nu_{nm})^2$ is very small, the term with this as
denominator very large. Since $\nu$ is always positive, it is not
possible for the other terms, involving $\nu + \nu_{nm}$ in the
denominator, to become so large. To an approximation, then, we neglect
all terms except the last, obtaining

$$\bar{c}_n c_n = \frac{(ex)_{nm}^2}{4h^2}\sum_\nu E_\nu^2\,\frac{(e^{2\pi i(\nu - \nu_{nm})t} - 1)(e^{-2\pi i(\nu - \nu_{nm})t} - 1)}{(\nu - \nu_{nm})^2}$$

$$= \frac{(ex)_{nm}^2}{2h^2}\sum_\nu E_\nu^2\,\frac{1 - \cos 2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}$$

$$= \frac{(ex)_{nm}^2}{h^2}\sum_\nu E_\nu^2\,\frac{\sin^2 \pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}. \qquad (17)$$

The formula we have just derived is decidedly significant. 
It gives essentially the probability that the system will go, in 
time $t$, from state $m$ to state $n$, under the action of the radiation.
For a particular frequency $\nu$, this probability is seen to be
proportional to $E_\nu^2$, that is, to the intensity of the incident
radiation; and to $(ex)_{nm}^2$, the square of the matrix component of
the electric moment


connected with this particular transition, which we should expect. 
But in addition, there is a dependence on frequency. If we plot 
$\bar{c}_n c_n$, at time $t$, against $\nu$, the impressed frequency, we
get a narrow peak with small side bands, centering at $\nu_{nm}$, just
like the pattern found in Fraunhofer diffraction. Thus, if the
impressed frequency is close to the absorption frequency $\nu_{nm}$,
there will be a large probability of transition, while if it is farther
away, the probability will be smaller. If the perturbation acts only
for a small time, the band will be broad, indicating that many
frequencies can cause the transition, but if the time is long enough,
practically only the frequency $\nu_{nm}$ can cause the transition; the
absorption curve of the substance, in other words, will have a 
sharp absorption line corresponding to the various transitions 
from the state m to other states n, as calculated by the quantum 

In carrying out the summation over $\nu$, it is evident that the
essential contributions will come for frequency $\nu$ very close to
$\nu_{nm}$. In this region, we may replace $E_\nu$ by its value at
$\nu_{nm}$, which we have already seen to be given by
$3E_{\nu_{nm}}^2/8\pi = \rho_{nm}\,d\nu$. Hence
the summation reduces to an integration,

$$\bar{c}_n c_n = \frac{8\pi (ex)_{nm}^2}{3h^2}\,\rho_{nm}\int \frac{\sin^2 \pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}\,d\nu. \qquad (18)$$

The integration should properly be taken from $\nu = 0$ to infinity.
But since the integrand is large only in the immediate neighborhood
of $\nu = \nu_{nm}$, we shall make a negligible error if we integrate
from $-\infty$ to $\infty$, obtaining
$\pi t\int_{-\infty}^{\infty}\frac{\sin^2 z}{z^2}\,dz$, where
$z = \pi(\nu - \nu_{nm})t$. This can be easily evaluated, giving $\pi^2 t$.
Thus we have finally

$$\bar{c}_n c_n = \frac{8\pi^3 (ex)_{nm}^2}{3h^2}\,\rho_{nm}t, \qquad (19)$$

or $B_{nm}\rho_{nm}t$, where $B_{nm}$ is as given before. Thus we have verified
our earlier statement regarding the probability coefficients B. 
A simple variation of the argument applies to states n of lower 
energy than the state m, resulting in the probability of forced 
emission, and if we compute $\bar{c}_m c_m$, we find that the number of
systems in the $m$th state decreases at a rate to compensate the
increase in the other states. This can be shown easily on
general grounds as well as by direct computation, for it can be
shown that the sum of the quantities $\bar{c}_n c_n$ for all states
remains constant.
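
The step from Eq. (18) to Eq. (19) rests on the frequency integral being proportional to the time. A rough numerical check is sketched below in Python, under stated assumptions: the integral is cut off at a finite half width on either side of the resonance frequency and evaluated by a simple midpoint rule, and the function name and grid parameters are illustrative choices.

```python
import math

def transition_integral(t, half_width=200.0, n=400_000):
    """Midpoint-rule value of the integral of sin^2(pi*x*t)/x^2 over |x| < half_width,

    where x stands for the frequency difference from resonance."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx   # midpoints never fall on x = 0 for even n
        total += math.sin(math.pi * x * t) ** 2 / (x * x)
    return total * dx

I1 = transition_integral(1.0)
I2 = transition_integral(2.0)
print(I1 / math.pi ** 2, I2 / I1)
```

Apart from the small part of the integral lost beyond the cutoff, the first ratio should be close to unity and the second close to two, which is the linear growth with time used in the text.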

238. Spontaneous Radiation and Coupled Systems. — The 
calculation we have just given did not lead to the probability 
of spontaneous emission $A_{nm}$. An attempt might be made to
include it by adding to the external force a radiation resistance 
term, depending on the velocity of the electron, but this method 
proves not to lead to the right answer. The proper treatment, 
as a matter of fact, must be sought in a different direction. 
We treat the radiation field, not as a perturbation, but as part 
of the system. It is possible to apply the quantum theory 
directly to the field by itself. For instance, if the radiation is 
confined in a rectangular box with perfectly reflecting walls, the 
electromagnetic field inside consists of a set of standing waves, 
of all the wave lengths allowed for a vibrating solid of the corre- 
sponding size, and with corresponding frequencies. We can 
now introduce normal coordinates, each corresponding to one 
mode of vibration, and the classical equations of motion of these 
normal coordinates are just like those of a linear oscillator. 
In a corresponding way, in wave mechanics, we treat these 
normal coordinates, set up a wave equation for each, and find 
that each one is quantized, with energy $(n + \frac{1}{2})h\nu$, where
$\nu$ is the frequency of the wave, $n$ a quantum number associated with
this particular mode of vibration. A change of this quantum
number by unity corresponds to an increase or decrease of the
energy of the radiation field by one unit $h\nu$, and this we identify
with the creation or destruction of a photon of this energy, by 
interaction with matter. 

Next we treat the atomic system just as if the radiation were 
not present. In this case, the atom will continuously stay 
in the same stationary state, and similarly the radiation field 
will always keep the same quantum numbers, meaning that no 
photons are being created or destroyed. But finally we introduce 
into the complete system of atoms and radiation a perturbation, 
corresponding to the potential of the atom in the radiation 
field (including the vector as well as scalar potential). This 
couples the two systems together, and under the influence of 
the perturbation transitions are possible, in which the atoms 
gain or lose energy in passing between stationary states, and 
the radiation field loses or gains an equal energy, which appears 
as destruction or creation of corresponding photons, or decrease 


or increase of the quantum number of the proper normal vibration 
of the radiation. When the probability of these processes is 
investigated, by the method of variation of constants, it is found 
that we obtain not only the probability of forced absorption 
or emission, $B\rho$, but also the probability of spontaneous emission
A. It is not hard by this method to investigate other questions 
as well, as for instance the breadths of absorption or emission 
lines — the question of just what frequencies of light can interact 
with a given atomic system. The general result is that, the 
shorter the life of an atom in either the upper or lower state 
associated with a transition, the broader the corresponding 
absorption or emission line. 

It is interesting to look a little more closely at the sort of 
perturbation problem we meet in considering spontaneous 
radiation, for example. Suppose we start with the atom in an 
excited state, and with no energy in the radiation field. Then, 
after the transition, the atom will be in its normal state, having 
lost energy, and the radiation will be in an excited state, having 
gained the corresponding energy. The total energy of the sys- 
tem will be the same in either case. Now neither one of these 
situations is a steady state, for neither one persists indefinitely. 
Both are approximate steady states, corresponding to the same 
energy. The perturbation problem, then, is one in perturbations 
of a degenerate system, having two equal energy levels. We have 
seen that such a perturbation problem leads to mathematics 
just like two coupled mechanical systems, as two pendulums, and 
it is convenient to use the mechanical language in describing what 
happens. Our present problem is like two pendulums of equal 
period (corresponding to the equal energy levels), coupled 
together. If the first pendulum vibrates alone, that corresponds 
to the state in which the atom is excited; if the second vibrates, 
it corresponds to the radiation being excited. But neither of 
these mechanical motions can occur by itself; if we start one 
pendulum vibrating, in time it comes to rest, and the other 
takes up all the energy. This corresponds to the fact that 
the system gradually changes so that the atom is in its normal 
state, the radiation excited. There is a flaw in our analogy, 
however: the energy in the mechanical case goes back to the first 
pendulum, while the atom does not come back to the excited 
state. The answer to this difficulty is easily given. The radia- 
tion field actually has not one mode of motion only, but many, 


all of about the same energy, all capable of interacting with 
the atom. Thus the emitted photon can travel in any direction, 
and not only that, photons of many different energies, all in 
the neighborhood of the energy ordinarily emitted by the atom, 
can interact, on account of the finite breadth of the spectral 
line. Thus while the situation where the atom is excited, and 
the radiation is in its normal state, is just one state, there are 
a great number of states corresponding to the other situation. 
It is as if our one pendulum corresponding to the excited atom, 
interacted with a great, or even infinite, number corresponding 
to the excited radiation. In these circumstances, the mechanical 
energy originally in the first pendulum would soon become 
dissipated, scattered through the others, and it will never happen 
all to come back to the first one, though a little might. Physi- 
cally, the radiation emitted by the atom travels to a great 
distance, and is very unlikely ever to find its way back to the 
atom which sent it out. But if the whole thing is enclosed in 
a box with reflecting walls, there will be a certain chance, finite 
though small, that the radiation will be eventually reflected 
back to the atom and absorbed. 
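
The pendulum picture can be tried out numerically. The Python sketch below is an illustration under several assumptions that are not in the text: units with h/2π = 1, a single excited-atom state of energy zero coupled with equal strength g to a set of evenly spaced radiation states, made-up values for the spacing and coupling, and a fourth-order Runge-Kutta integration of the equations of variation of constants. With one radiation state the excitation returns completely to the atom; with many closely spaced states it leaks away and does not return within the time simulated.

```python
import math

def evolve_star(energies, g, t_end, steps):
    """Integrate dc0/dt = -i*g*sum(c_k), dc_k/dt = -i*(e_k*c_k + g*c0).

    c0 is the amplitude of 'atom excited, field empty' (energy taken as zero);
    c[k] is the amplitude of 'atom normal, field state k excited'."""
    c0 = 1.0 + 0.0j
    c = [0j] * len(energies)
    dt = t_end / steps

    def deriv(a0, a):
        d0 = -1j * g * sum(a)
        d = [-1j * (e * ak + g * a0) for e, ak in zip(energies, a)]
        return d0, d

    for _ in range(steps):
        k1_0, k1 = deriv(c0, c)
        k2_0, k2 = deriv(c0 + 0.5 * dt * k1_0, [x + 0.5 * dt * k for x, k in zip(c, k1)])
        k3_0, k3 = deriv(c0 + 0.5 * dt * k2_0, [x + 0.5 * dt * k for x, k in zip(c, k2)])
        k4_0, k4 = deriv(c0 + dt * k3_0, [x + dt * k for x, k in zip(c, k3)])
        c0 += dt / 6.0 * (k1_0 + 2 * k2_0 + 2 * k3_0 + k4_0)
        c = [x + dt / 6.0 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(c, k1, k2, k3, k4)]
    return c0, c

G = 0.05   # assumed coupling strength

# Two pendulums: one radiation state of the same energy; after time pi/G the
# excitation has gone over to the field and come back to the atom.
a_two, f_two = evolve_star([0.0], G, math.pi / G, 2000)
p_two = abs(a_two) ** 2

# One pendulum against many: 101 radiation states spaced 0.1 apart around zero.
many = [0.1 * (k - 50) for k in range(101)]
a_many, f_many = evolve_star(many, G, 20.0, 2000)
p_many = abs(a_many) ** 2
norm_many = p_many + sum(abs(x) ** 2 for x in f_many)

print(p_two, p_many, norm_many)
```

The total probability stays equal to one in both runs; only its distribution between the atom and the radiation states changes.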

One significant feature of the situation is that there are real 
stationary states for the system of atom plus radiation. This 
follows directly from the fact that we can solve the perturbation 
problem. Just as with the coupled pendulums, there are normal 
coordinates, consisting of combinations of the various separate 
coordinates. Thus, there is some combination of all the various 
probabilities of the atom being in various states of excitation and 
the radiation field being in corresponding states which could 
persist indefinitely, and is thus a stationary state. The things 
we ordinarily think of as stationary states are combinations of 
these, just as the state where one pendulum is excited, the other 
at rest, is a combination of the two normal coordinates, with 
definite amplitudes and phases. These are really not stationary 
states at all, for they change with time. In any such problem, 
there are two equally good methods of treatment: first, we may 
use the unperturbed states which physically seem like stationary 
states, treating the perturbations between them by variation of 
constants, and so introducing apparent transitions into the 
problem ; or secondly, we may introduce the real stationary states, 
by the ordinary perturbation theory, introducing the correct 
initial conditions, and following what happens as time goes on, 


without having any transitions at all between these real station- 
ary states. This point of view is very illuminating, for it shows 
us that the only distinction between stationary states and transi- 
tions is largely artificial, determined by the original unperturbed 
wave functions in which we choose to discuss the system. 

239. Applications of Coupled Systems to Radioactivity and 
Electronic Collisions. — Many other problems of transitions can 
be looked at from the same point of view we have just used in 
discussing radiation. One example is the radioactive disinte- 
gration, which we have considered in Chap. XXIX, Fig. 64. We 
might take as approximate stationary states first the discrete 
states of a particle within the finite depression, second the con- 
tinuous states of the particle outside. If the barriers were 
infinitely high, there would be no transitions between them, but 
if the barrier is finite, we may start with a particle within the 
nucleus, and consider that it has a certain probability of a transi- 
tion to a state of equal energy outside the barrier. This could 
be treated by the perturbation theory of degenerate systems, 
where we could find the probability of leaking out by variation 
of constants, or alternatively could get approximations to the 
actual stationary states of the system. In this case, as with 
radiation, the probability of the particle coming back and getting 
back into the nucleus again, though small, is finite, if the system 
is enclosed in a finite box. Here the stationary states which are 
combinations of solutions for the discrete and continuous regions 
are perfectly reasonable and natural, and the more accurate way 
of solving the problem would be to determine these stationary 
states by the Wentzel-Kramers-Brillouin method, and build up a 
wave packet at $t = 0$ corresponding to having all the distribution
inside the nucleus, and asking how this packet spreads out as 
time goes on, though without change of real stationary states. 

Another similar problem is that of collisions, either elastic 
or inelastic. Suppose that an electron collides with an atom, 
being scattered either without change of energy, or with decrease 
or increase of energy corresponding to raising or lowering the 
energy of the atom. We can start with a number of unperturbed, 
not quite stationary, states: first, the electron approaching the 
atom, with the atom in its original state; secondly, an electron 
being scattered, say in a definite direction, or better with some 
function of angle represented by a spherical harmonic, with its 
initial energy, the atom being unchanged; thirdly, an electron 


scattered with a decrease of energy corresponding to a transition 
of the atom, with the atom in the correspondingly excited state; 
fourthly, an electron scattered with increase of energy, the atom 
being in a lower state, after what is called a collision of the second 
kind. All these states have the same energy, so that the pertur- 
bation problem between them, resulting from the fact that they 
are not solutions of the problem in the region where the electron 
is in the atom, is one of transition between systems of the same 
energy. Here, as before, it is often convenient to proceed by the 
method of variation of constants, and from this we get the prob- 
abilities of the various elastic and inelastic impacts. One thing 
is worth noting in all these problems : in the method of variation 
of constants, the quantity determining the probability of transi- 
tion is the nondiagonal matrix component of the perturbing 
energy between the different approximately stationary states. 
Thus the calculation resolves itself into a computation of these 
matrix components, and transitions are likely for which the matrix 
components are large. In our radiation problem, the matrix 
components in question were those of the electrical energy, involv- 
ing directly the matrix components of electric moment of the atom. 
While the perturbation method can be used for discussing 
collisions, it is not very accurate, on account of the large perturba- 
tions which the colliding electron exerts on the atom during the 
instant of collision. Fortunately, at least in the case of elastic 
collisions, much better approximation methods are available. 
As we shall see later, an atom acts on an electron very much 
like a central field of force, and the problem of the scattering of 
an electron by a central field is merely the special case of the 
central field problem, discussed in the next chapter, which we 
meet if the electron is in a continuous rather than a quantized 
energy level. By analogy with the results which we shall obtain 
in Sec. 241, the wave function of an electron in a central field is 
a product of spherical harmonics of angle, times a certain func- 
tion of r, and for an electron coming from infinity, this function of 
r is of the form shown in Fig. 62, satisfying a definite boundary 
condition at the center of the atom, but becoming sinusoidal
for large values of r. By combining an infinite number of such 
solutions, all corresponding to the same energy, but with different 
functions of angles, it can be shown that we can make the result- 
ant wave at large distances approach a plane wave, representing 
a stream of electrons traveling in a definite direction. But the 


functions are such that, if the central field is not vanishingly 
small, it is not possible to build up exactly a plane wave. 
Instead, there are certain terms left over representing spherical 
waves traveling outward from the center of force, with amplitudes 
proportional to 1/r, so that they are negligible compared to the 
plane waves at sufficiently large distances. These spherical 
waves represent the elastically scattered electrons. 

Two particularly interesting features of the elastic scattering 
can be investigated by the method just described. First, one 
may find the total intensity in the scattered wave, which can be 
proved to be equal to the total intensity removed from the plane 
wave by its passage over the atom. This gives the probability 
that an electron will be scattered by the atom, and it proves to 
increase as the atomic number of the atom increases, and to 
depend in a complicated way on the speed of the electron. This 
dependence is so complicated that in some cases, called the Ram- 
sauer effect, very slow electrons have abnormally small probabil- 
ities of being scattered, and practically pass through the atom 
without hindrance. The probability of scattering is often 
described by defining an effective cross section for the atom, a 
cross section such that if all electrons striking it were scattered, 
and all passing around it were not, the probability of scattering 
would agree with the observed value. Plainly the effective 
cross section depends on electron velocity and on the nature of 
the atom. The second interesting feature of elastic scattering 
is the angular distribution of the scattered electrons, determined 
by the relative probabilities of scattering with the various spher- 
ical harmonic functions of angle. This again can show a com- 
plicated dependence on electron velocity and atomic constitution. 


1. Prove that if both unperturbed and perturbed functions, $u_n^0$ and $u_n$, 
are orthogonal and normalized, the transformation coefficients $S_{mn}$ satisfy 
the orthogonality and normalization conditions. 

2. Show that if we expand the correct wave functions in a series of functions 
which are not exactly orthogonal or normalized, the equations for 
the transformation coefficients $S_{mn}$ are 

$$\sum_m (H_{km} - E_n d_{km}) S_{mn} = 0,$$

where $d_{km} = \int \bar{u}_k^0 u_m^0 \, dv$, which now is not diagonal and is not equal to $\delta_{km}$. 

3. Consider a degenerate system in which there are two unperturbed 
wave functions, having equal diagonal energies $H_{11} = H_{22}$, which are normalized 
but not orthogonal to each other, so that $\int \bar{u}_1^0 u_2^0 \, dv = d_{12} \neq 0$. 
Show that the two energy levels are 

$$\frac{H_{11} + H_{21}}{1 + d_{12}}, \qquad \frac{H_{11} - H_{21}}{1 - d_{12}}.$$

4. Show that the two correct wave functions in Prob. 3 are 

$$\frac{u_1^0 + u_2^0}{\sqrt{2(1 + d_{12})}}, \qquad \frac{u_1^0 - u_2^0}{\sqrt{2(1 - d_{12})}},$$

respectively. Prove them to be normalized and orthogonal. 

5. Solve the problem of a system with two degenerate unperturbed levels 
of the same energy, by the method of variation of constants. Show that 
the equations for the time derivatives of the c's can be solved by assuming 
an exponential or sinusoidal solution. Show that the final solution is a 
pulsation from one state to the other, the frequency of pulsation being 

6. Prove by perturbation theory that the energy levels of a linear oscil- 
lator are not affected by a constant external field, except in absolute value, 
all being shifted up or down together. Why should this be expected? 

7. Find whether a rotator's energy is affected, to the first or higher orders 
of approximation, by a constant external field in the plane of the rotator. 

8. Prove in Einstein's derivation of Planck's radiation law that $B_{12} = B_{21}$, 
by considering equilibrium in the limiting case of extremely high tempera- 
ture, noting that in this limit the probability of forced transition is large 
compared with that of spontaneous transition, on account of the large 
density of radiation. 

9. Prove directly from Schrodinger's equation that the sum $\sum_n \bar{c}_n c_n$ always 
remains constant. 

10. For the problem of interaction of atoms and radiation, when the atom 
starts in the mth state, work out $\bar{c}_m c_m$ as a function of time, and show that 
this, added to the other $\bar{c}_n c_n$'s, gives a constant. 


In the preceding chapters we have been discussing the general 
principles and methods of wave mechanics. We have seen that 
from wave mechanics one can derive ordinary Newtonian mechan- 
ics as a special case. But by far the most interesting mechanical 
problem which demands wave mechanics for its solution is the 
structure of atoms, molecules, and matter in general. We shall 
accordingly devote the remaining chapters of this book to the 
structure of matter. This is a problem which is doubly interest- 
ing; first, as a most important subject in itself, secondly, as the 
finest illustration of wave mechanics. 

240. The Atom and Its Nucleus. — An atom consists of a 
nucleus, and a number of electrons. All electrons are alike, 
electrified particles of negative charge $-e = -4.774 \times 10^{-10}$ 
e.s.u., mass of $9.00 \times 10^{-28}$ gm. Nuclei are heavier, and positively 
charged. The charges on nuclei are found in every case 
to be integral multiples of the charge e. Thus a nucleus may 
have a charge Ze, where Z is an integer, and in this case Z is 
called the atomic number. If the atom has enough electrons 
to be electrically neutral, it is obvious that it must have Z elec- 
trons, so that the atomic number measures both the charge on 
the nucleus and the number of electrons in the neutral atom. 
We shall see that this number Z is the determining quantity in 
fixing the properties of the atoms; if all atoms are tabulated in 
order of their atomic numbers, they show periodic properties, 
for reasons which we shall discuss in the next chapter, and this 
arrangement is called the periodic table of the elements. Of 
course, the number of electrons on the atom does not always 
have to be just the atomic number; violent methods, as bombard- 
ment, can knock electrons off, or in some cases extra electrons 
can be added, producing positive or negative ions, respectively. 
We shall see that some elements, the electropositive or alkaline 
ones, have a tendency to lose electrons, and form positive ions, 
while the basic elements tend to gain electrons and become nega- 



tive ions. Atoms often enter chemical compounds as ions, rather 
than neutral atoms, so that in our study of atomic structure we 
shall have to speak constantly of ions as well as neutral atoms. 

The element of atomic number one is hydrogen, the simplest 
element. Its nucleus is an elementary particle, called the proton, 
with mass 1,846 times that of the electron. The heavier nuclei 
appear to be built up from a combination of protons and neutrons, 
particles of no charge, but of mass approximately equal to that 
of the proton. There are approximately equal numbers of 
protons and neutrons in any nucleus, making the atomic weight 
(the mass of the nucleus, in multiples of the mass of the proton) 
approximately twice the atomic number, though this rule is far 
from exact, the heavier atoms containing more neutrons in pro- 
portion than the light ones. The forces holding the nucleus 
together are presumably largely forces of attraction between 
protons and neutrons, more than counterbalancing the repulsions 
between protons on account of their like electric charges. By the 
action of these forces, stable structures are produced, disintegrat- 
ing only in the case of the heavy, radioactive elements, or in the 
very light elements under heavy bombardment. The theory of 
the structure of the nucleus is still in a preliminary state, and we 
shall not consider it; ordinary properties of matter prove to be 
almost completely independent of the nuclear structure, depend- 
ing only on its charge and mass, with most properties depending 
only on its charge, so that two nuclei of the same charge and 
different masses, called isotopes, exhibit almost identical prop- 
erties. Such isotopes are of very common occurrence, many 
ordinary elements being a mixture of several, the chemical atomic 
weights being weighted means of the weights of the isotopes, 
explaining why many observed atomic weights are far from whole numbers. 

241. The Structure of Hydrogen. — The simplest element is 
hydrogen, with but one electron moving about a single nucleus. 
Fortunately the problem of its structure, according to wave 
mechanics, can be exactly solved, and it serves as a model for the 
more complicated elements. In fact, we have already carried out 
many of the mathematical steps in problems at one time or 
another, so that we shall merely have to summarize results here. 
For generality, we shall treat not merely hydrogen, but the prob- 
lem of a single electron moving about a nucleus of charge Ze. 
The first thing we notice is that the nucleus is very heavy, com- 


pared with an electron. Now if we have a single electron and a 
single nucleus, exerting forces on each other, we find, in wave 
mechanics as in classical mechanics, that the center of gravity 
of the system remains fixed, each particle moving about the 
common center of gravity. But the center of gravity is very 
close to the nucleus; it divides the vector joining nucleus and 
electron in the ratio of 1:1,846. Thus the nucleus executes only 
very slight motions, and practically we can treat it as being fixed, 
and the electron as moving about a fixed center of attraction. 
We shall find that this is a very general method in discussing the 
structure of matter: we first assume all nuclei to be fixed, and 
discuss the motion of the electrons about them. Only later do 
we have to take the motions of the nuclei into account. We 
discuss this more in detail in a later chapter. 

We have, then, an electron of charge e, mass m, moving in a 
central field of force. The attractive force of the nucleus has a 
potential energy $-Ze^2/r$. Thus Schrodinger's equation, with 
the time eliminated, is 

$$Hu = \left(-\frac{h^2}{8\pi^2 m}\nabla^2 - \frac{Ze^2}{r}\right)u = Eu. \qquad (1)$$


We shall find it convenient in all our atomic problems to introduce 
at the outset so-called atomic units of distance and energy. The 
unit of distance is $a_0 = h^2/4\pi^2 m e^2$, a unit first introduced in 
Bohr's theory of the hydrogen atom, but which comes into the 
present discussion as well. It is equal to 0.53 Angstrom. The 
unit of energy most convenient to use is $2\pi^2 m e^4/h^2$, though sometimes 
a unit twice as great is used. This is the energy required 
to ionize a hydrogen atom from its normal state. It is most 
conveniently stated, not in ergs, but in volt-electrons. A volt-electron 
by definition is the energy an electron acquires in falling 
through a difference of potential of 1 volt, or $eV = 4.774 \times 10^{-10} \times \frac{1}{300}$ 
ergs. In terms of this, our fundamental unit of energy is 
13.54 volt-electrons. Associated with this energy is a frequency, 
given by energy $= h\nu$, and a wave length, and its reciprocal a 
wave number, given by $1/\lambda = \nu/c$. The wave number associated 
with our unit of energy is the so-called Rydberg number, R = 
109,737 per centimeter, and the corresponding energy is Rhc. 
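
As a rough numerical check of these units, the following sketch evaluates the unit of distance, the unit of energy, and the Rydberg number from the formulas just given. The constants are an assumption: they are present-day CGS values, not the values quoted in the text, so the results come out slightly different (about 13.6 rather than 13.54 volt-electrons). The function names are illustrative only.

```python
import math

# Present-day CGS (Gaussian) constants; assumed values, not those quoted in the text.
H = 6.62607015e-27   # Planck's constant h, erg sec
M = 9.1093837e-28    # electron mass m, gm
E = 4.80320471e-10   # electronic charge e, e.s.u.
C = 2.99792458e10    # velocity of light c, cm/sec

def bohr_radius_cm():
    """Unit of distance a_0 = h^2 / (4 pi^2 m e^2)."""
    return H**2 / (4 * math.pi**2 * M * E**2)

def rydberg_energy_erg():
    """Unit of energy 2 pi^2 m e^4 / h^2 (ionization energy of hydrogen)."""
    return 2 * math.pi**2 * M * E**4 / H**2

def volt_electron_erg():
    """Energy gained by charge e falling through 1 volt (1 volt is about 1/300 e.s.u.)."""
    return E * 1e8 / C

def rydberg_wave_number():
    """Wave number R associated with the unit of energy, from energy = R h c."""
    return rydberg_energy_erg() / (H * C)

if __name__ == "__main__":
    print("a_0 in Angstroms:", bohr_radius_cm() * 1e8)
    print("unit of energy in volt-electrons:", rydberg_energy_erg() / volt_electron_erg())
    print("R per centimeter:", rydberg_wave_number())
```

The small differences from the figures in the text reflect only the later revision of the measured constants, not any change in the formulas.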

In terms of our atomic units, Schrodinger's equation for hydro- 
gen can be rewritten, eliminating all the dimensional constants. 


Thus, if our new distances are the old ones divided by $a_0$, the 
new energy the old divided by Rhc, we easily find that 

$$\left(-\nabla^2 - \frac{2Z}{r}\right)u = Eu, \qquad (2)$$

where the derivatives are to be taken with respect to the new 
x, y, z. The coefficient 2 in the potential energy appears in the 
process of changing variables, the potential energy of two elec- 
tronic charges being 2/r in these units. 
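
The change of variables can be written out step by step. In the sketch below the primed symbols $r'$, $E'$ for the new distance and energy are notation introduced only for this check; the text drops the primes at once.

```latex
% Substitute r = a_0 r', E = Rhc\,E' in Eq. (1), with
% a_0 = h^2/4\pi^2 m e^2 and Rhc = 2\pi^2 m e^4/h^2.
\begin{align*}
-\frac{h^2}{8\pi^2 m a_0^2}\,\nabla'^2 u - \frac{Ze^2}{a_0 r'}\,u &= Rhc\,E'u,\\
\frac{h^2}{8\pi^2 m a_0^2} = \frac{2\pi^2 m e^4}{h^2} = Rhc,
\qquad
\frac{e^2}{a_0} &= \frac{4\pi^2 m e^4}{h^2} = 2Rhc,\\
\left(-\nabla'^2 - \frac{2Z}{r'}\right)u &= E'u .
\end{align*}
```

The factor 2 thus arises because $e^2/a_0$ is twice the unit of energy.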

Schrodinger's equation can now be solved, in spherical coordi- 
nates, by separation of variables. Using the results of Chap. 
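XV, Probs. 6 to 8, the equation can be separated, letting $u = R\Theta\Phi$, and the differential equations are 

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right]R = 0,$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left[l(l+1) - \frac{m^2}{\sin^2\theta}\right]\Theta = 0,$$

$$\frac{d^2\Phi}{d\phi^2} + m^2\Phi = 0. \qquad (3)$$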

The solutions of the second and third are $\Theta = P_l^m(\cos\theta)$, $\Phi = e^{\pm im\phi}$, 
or $\cos m\phi$ or $\sin m\phi$, where m must be an integer in order 
to have the function single-valued as far as $\phi$ is concerned, and 
l must be an integer in order not to have the function P become 
infinite for $\cos\theta = 1$. The P's are called associated spherical 
harmonics, and are given by 

$$P_l^m(\cos\theta) = \sin^{|m|}\theta\,(A_0 + A_1\cos\theta + A_2\cos^2\theta + \cdots),$$

$$A_k = A_{k-2}\,\frac{(k + |m| - 1)(k + |m| - 2) - l(l+1)}{k(k-1)}. \qquad (4)$$

For integral l's, this series breaks off, the last nonvanishing term 
being for $k = l - |m|$. For even $l - |m|$, the expansion is in even 
powers, and for odd $l - |m|$ in odd powers. The functions R are 
discussed in Prob. 3. We use a simple transformation of the 
dependent variable, $y = rR$. The equation in this variable is 

$$\frac{d^2y}{dr^2} + \left[E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right]y = 0. \qquad (5)$$


The solution is 

$$y = e^{-r\sqrt{-E}}\,r^{l+1}(A_0 + A_1 r + A_2 r^2 + \cdots),$$

$$A_k = -2A_{k-1}\,\frac{Z - (l+k)\sqrt{-E}}{(l+k)(l+k+1) - l(l+1)}. \qquad (6)$$


This series breaks off if $E = -Z^2/n^2$, where n is an integer. 
A simple discussion shows that if it does not break off, the resulting 
infinite series becomes infinite as r becomes infinite like 
$e^{2r\sqrt{-E}}$, so that y becomes infinite, and is not admissible as a 
wave function for a stationary state. We therefore limit ourselves 
to integral n's, and n is called the principal or total quantum 
number, determining the energy. In terms of it, we have 

$$y = e^{-Zr/n}\,r^{l+1}(A_0 + A_1 r + \cdots + A_{n-l-1}r^{n-l-1}),$$

$$A_k = -\frac{2Z}{n}\,A_{k-1}\,\frac{n - l - k}{(l+k)(l+k+1) - l(l+1)}. \qquad (7)$$

From this recursion formula, we see that l cannot be greater than 
n - 1, in order to have any terms to the series; and from the 
earlier recursion formula for the function of $\theta$, |m| cannot be 
greater than l. The principal quantum number n, and the so-called 
azimuthal quantum number l cannot be negative, the 
smallest allowable value of n being 1 and of l zero. The so-called 
magnetic quantum number m, however, can be positive, negative, 
or zero, so long as its magnitude falls within the allowed limits. 
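
The two recursion formulas, for the function of angle and for the function of r, can be tried out directly. The sketch below generates the coefficients in exact rational arithmetic; the function names, the choice of starting coefficient equal to 1, and the use of Python's `fractions` module are assumptions made for illustration, and no normalization is attempted.

```python
from fractions import Fraction

def legendre_coeffs(l, m):
    """Coefficients A_k in P_l^m ~ sin^|m|(theta) * sum_k A_k cos^k(theta),
    from A_k = A_{k-2} [(k+|m|-1)(k+|m|-2) - l(l+1)] / [k(k-1)]."""
    m = abs(m)
    if m > l:
        raise ValueError("|m| cannot exceed l")
    top = l - m                        # highest power; the series breaks off here
    coeffs = [Fraction(0)] * (top + 1)
    coeffs[top % 2] = Fraction(1)      # even or odd series, arbitrary starting value
    for k in range(top % 2 + 2, top + 1, 2):
        num = (k + m - 1) * (k + m - 2) - l * (l + 1)
        coeffs[k] = coeffs[k - 2] * Fraction(num, k * (k - 1))
    return coeffs

def radial_coeffs(n, l, Z=1):
    """Coefficients A_k in y = exp(-Z r/n) r^(l+1) sum_k A_k r^k, with A_0 = 1,
    from A_k = -(2Z/n) A_{k-1} (n-l-k) / [(l+k)(l+k+1) - l(l+1)]."""
    if not 0 <= l <= n - 1:
        raise ValueError("l must lie between 0 and n - 1")
    coeffs = [Fraction(1)]
    for k in range(1, n - l):          # the factor (n - l - k) vanishes at k = n - l
        denom = (l + k) * (l + k + 1) - l * (l + 1)
        coeffs.append(coeffs[-1] * Fraction(-2 * Z * (n - l - k), n * denom))
    return coeffs

if __name__ == "__main__":
    print(legendre_coeffs(2, 0))   # coefficients of 1, cos, cos^2
    print(radial_coeffs(3, 0))     # polynomial factor of the 3s function
```

For l = 2, m = 0 the angular series is proportional to $1 - 3\cos^2\theta$, and for n = 2, l = 0 the radial polynomial is $1 - r/2$, in agreement with the familiar forms of these functions.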
242. Discussion of the Function of r for Hydrogen. — Though 
we have an exact solution for hydrogen, a qualitative discussion 
is still desirable, using the method of energy. In Chap. VII we 
have already discussed motion in a central field in classical 
mechanics. We have seen that the motion along the radius is 
like a one-dimensional oscillation, in a potential field $V + p^2/2mr^2$, 
where V is the potential energy, p the angular momentum. 
In our case, the differential equation for y is like a one-dimensional 
wave mechanical problem with a potential, in 
atomic units, of $-\frac{2Z}{r} + \frac{l(l+1)}{r^2}$, or in ordinary units $-\frac{Ze^2}{r} + \frac{p^2}{2mr^2}$, 
where $p = \sqrt{l(l+1)}\,h/2\pi$. It is thus clear in the first 
place that the quantum number l determines the angular momentum, 
in units of $h/2\pi$, though the values are not l times this 
unit, but $\sqrt{l(l+1)}$ times it. We shall further discuss the 
angular momentum later on. Now it is interesting to draw 
the various potentials, as we do in Fig. 66, where $-\frac{2}{r} + \frac{l(l+1)}{r^2}$ 
is plotted, for l = 0, 1, 2. We have also plotted $-\frac{2}{r} + \frac{k^2}{r^2}$, 
indicated by the dotted lines. The reason for this is that 
in Bohr's theory of hydrogen, it was assumed that the electron 
moved according to classical mechanics, and that its energy 
could have only those particular values for which the quantum 
conditions were fulfilled. He assumed that the angular momentum 
was $kh/2\pi$, where k was an integer, so that if we discuss 
the classical motion with these dotted potential curves, we shall 
have precisely Bohr's orbits. He also assumed 

$$\oint p_r\,dr = \oint \sqrt{2m(E - V - p^2/2mr^2)}\,dr = n_r h, \qquad p = kh/2\pi.$$

Fig. 66. — Potential and energy levels for hydrogen. Full lines: $-\frac{2}{r} + \frac{l(l+1)}{r^2}$ 
(potential corrected for centrifugal force, wave mechanics). Dotted lines: 
$-\frac{2}{r} + \frac{k^2}{r^2}$ (corrected potential, Bohr theory). Horizontal lines represent energy levels. 

The energy levels, either on Bohr's theory or wave mechanics, 
are $-1/n^2$, where on Bohr's theory $n = k + n_r$, and these are 
drawn, at $-1$, $-\frac{1}{4}$, $-\frac{1}{9}$, etc. Now consider the particular 
case k = 1. The lowest possible energy level for this is evidently 
-1; for here E intersects the potential curve at but one point, 
giving, therefore, a circular orbit, the perihelion and aphelion 
distances being equal. As we see from the diagram, the radius 
of the circular orbit is one unit, and the energy minus one unit, 
explaining, therefore, the origin of the units. But for this same 
k, higher energy levels are connected with elliptical orbits, as, 
for example, that for which n = 2, k = 1, with perihelion 
smaller, aphelion larger than the circle for n = 1. For n = 2 
there is a second Bohr orbit, for k = 2: a circle of radius 4 units. 
Similarly for n = 3, there are three orbits, for k = 1, 2, 3, 
and so on, the orbit for k = n being in each case a circle. This 
question is discussed in a problem, where it is shown that the 
orbits are ellipses, of semimajor axis equal to $\frac{n^2}{Z}\,\frac{h^2}{4\pi^2 m e^2} = \frac{n^2}{Z}\,a_0$, 
and minor axis equal to k/n times the major axis. 
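
The statements about perihelion, aphelion, and axes can be checked numerically for hydrogen (Z = 1). The sketch below finds the turning points of the classical radial motion, where the energy $-1/n^2$ equals the corrected Bohr potential $-2/r + k^2/r^2$; setting these equal gives $r^2 - 2n^2 r + n^2k^2 = 0$. The function name and the returned dictionary are illustrative assumptions.

```python
import math

def bohr_orbit(n, k):
    """Turning points and axes, in atomic units with Z = 1, of the classical motion
    with energy -1/n^2 in the potential -2/r + k^2/r^2.
    The turning points are the roots r = n^2 (1 -/+ sqrt(1 - k^2/n^2))."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    ecc = math.sqrt(1 - (k / n) ** 2)   # eccentricity of the ellipse
    a = n ** 2                          # semimajor axis: mean of the two turning points
    return {
        "perihelion": a * (1 - ecc),
        "aphelion": a * (1 + ecc),
        "semimajor": a,
        "semiminor": a * k / n,         # k/n times the semimajor axis
    }

if __name__ == "__main__":
    for n, k in [(1, 1), (2, 1), (2, 2), (3, 1)]:
        print(n, k, bohr_orbit(n, k))
```

For n = 1, k = 1 both turning points fall at one unit, the circular orbit; for n = 2, k = 1 the perihelion lies inside and the aphelion outside that circle, as stated above.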

In the wave mechanics, where the angular momentum has 
the nonintegral value $\sqrt{l(l+1)}$ units, we must use the full 
lines. Now we are interested in the region where the kinetic 
energy is positive, not as the only place where motion can occur, 
but as the region where the wave function is sinusoidal. Out- 
side this region, it falls off exponentially. We can see a few 
examples in Fig. 67, in which the first few wave functions are 
plotted (we plot y, equal to r times the radial part of the wave 
function). On each function the limits of the region of classi- 
cal motion are determined by the fact that the points of 
inflection come here, the tendency of the curves being sinusoidal 
between the points, exponential outside. It is plain that the 
wave functions are larger where the electron is likely to be found, 
small where it is not, as we could prove by deriving the solution 
from the Wentzel-Kramers-Brillouin method, a possible, though 
not very convenient, method of discussing the hydrogen prob- 
lem. As this method would show at once, the wave length 
and amplitude both become large as r becomes large, and E — V 
becomes small, so that the outermost maximum of the wave 
function is in all cases the largest, and contributes most to the 
wave function as a whole. One property of the wave function 
is evident from Fig. 67: for small r, the behavior is determined 
mostly by I, for large r mostly by n. This is natural from the 

fact that for small r the quantity $E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}$ approaches 
$-\frac{l(l+1)}{r^2}$, and for large r it approaches $E + \frac{2Z}{r} = -\frac{Z^2}{n^2} + \frac{2Z}{r}$. 

We note that as I becomes smaller and smaller, the region where 
the wave function is large, or the classical orbit, penetrates 



closer and closer to the nucleus. For large r, and, as a matter 
of fact, for the whole outer maximum, which, as we have seen, 
is the most important one, a fairly good approximation to the 
wave function is simply $r^n e^{-Zr/n}$, the wave function for the orbit 
of maximum azimuthal quantum number (l = n - 1), corresponding 
to the circular orbit in Bohr's theory. It is interesting 
to note that this function has its maximum at $r = \frac{n^2}{Z}\,a_0$, just 
the radius of the corresponding circular orbit in Bohr's theory. 
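
The position of this maximum follows in one line. The sketch below assumes the approximate function $y = r^n e^{-Zr/n}$ with r measured in atomic units.

```latex
% Maximum of y = r^n e^{-Zr/n}, r in units of a_0:
\[
\frac{dy}{dr} = \left(\frac{n}{r} - \frac{Z}{n}\right) r^n e^{-Zr/n} = 0
\quad\Longrightarrow\quad
r = \frac{n^2}{Z},
\]
% that is, n^2 a_0 / Z in ordinary units, the radius of Bohr's circular orbit k = n.
```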
243. The Angular Momentum. — We have seen that the 
quantity $\sqrt{l(l+1)}\,h/2\pi$ corresponds to the angular momentum of 

the orbit. This can be seen by computing the matrix of total 
angular momentum, or rather of its square, which is more con- 
venient. We can most easily get the operator for the angular 
momentum, in spherical coordinates, by an indirect method. 
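Classically, $H = p_r^2/2m + p^2/2mr^2 + V$, where p is the total 
angular momentum. Now in wave mechanics we find the wave 
equation such that 

$$H = \frac{1}{2mr^2}\,\frac{h}{2\pi i}\frac{\partial}{\partial r}\left(r^2\,\frac{h}{2\pi i}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}\left[\frac{1}{\sin\theta}\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2}{\partial\phi^2}\right] + V.$$

By comparison, it is plain that the operator for $p^2$ is 

$$p^2 = \frac{1}{\sin\theta}\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2}{\partial\phi^2}. \qquad (8)$$

But now from the differential equations for $\Theta$ and $\Phi$, we easily 
have, using this operator, 

$$p^2 u = l(l+1)\left(\frac{h}{2\pi}\right)^2 u. \qquad (9)$$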

That is, $p^2$ has a diagonal matrix (since $p^2 u$ is a constant times u, 
without any terms in other characteristic functions), and the 
diagonal value is $l(l+1)(h/2\pi)^2$, so that the total angular momentum 
is constant, as it must be in the absence of torques. We 
can also easily find the component of angular momentum along 
the z axis. The angular momentum along this axis is the momentum 
conjugate to the angle $\phi$ of rotation about the axis, so that 
its operator is $\frac{h}{2\pi i}\frac{\partial}{\partial\phi}$. Now take the solutions where $\phi$ enters 
into the wave function as the exponential, $e^{\pm im\phi}$. Then 
$p_\phi u = \frac{h}{2\pi i}\frac{\partial u}{\partial\phi} = \pm m\,\frac{h}{2\pi}\,u$. This again is diagonal, showing that the 
component of angular momentum remains constant. Further, 
if we use the wave function $e^{im\phi}$, the component equals $m\,h/2\pi$. 
The interpretation of these results is best made in terms of 
a vector model. Suppose we consider that the angular momentum 
of the orbit is $l\,h/2\pi$. This will then be regarded as a vector, 
normal to the plane of the orbit, pointing in some arbitrary 
direction in space. The component of angular momentum along 
the z axis is simply the projection of the vector in that direction. 
Now we find that this can have only the quantized values 
$m\,h/2\pi$. Hence there are only a finite number of possible 
orientations for the orbit, as shown in Fig. 68, for the states 
for l = 3. Plainly m can go from a maximum of l to a minimum 
of -l, or 2l + 1 values in all, just as one finds from the 
discussion of the spherical harmonics. Now this vector 
diagram is only suggestive, not strictly true. We see this from 
the fact that our vector has length $l\,h/2\pi$, while the actual 
angular momentum is $\sqrt{l(l+1)}\,h/2\pi$. The fundamental reason 
is that, since the angular momentum and its component are 
exactly given, the uncertainty principle does not allow us to fix 
definitely the plane of the orbit, which corresponds to a coordinate. 
As a matter of fact, the electron in wave mechanics does not move 
exactly in a plane, but strays outside the plane, as the uncertainty 
principle would suggest. This is best shown by polar diagrams of 
the spherical harmonics, plotting the square of the spherical 
harmonic, which gives the density, as function of angle. This is 
done in Fig. 69, for l = 1, m = 1 and 0, and l = 2, m = 2, 1, 0. 
(l = 0 does not depend on angle.) If we imagine these figures 
rotated about the axes, we see that for m = l, the figure indicates 
that most of the density is in the plane normal to the axis, but 
considerable is out of the plane. For l = 2, m = 1, for instance, 
the density lies near a cone, as if the plane of the orbit took up 
all directions whose normal made the proper angle with the axis. 

Fig. 68. — Possible orientations of angular momentum, for l = 3. 
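
The quantized orientations can be tabulated with a few lines of code. The sketch below departs from the simple vector diagram in one stated respect: it takes the length of the vector to be the actual value $\sqrt{l(l+1)}$ rather than l, in units of $h/2\pi$, so that even for m = l the vector is inclined to the axis. The function name is an illustrative assumption.

```python
import math

def orientations(l):
    """Return (m, angle in degrees) for m = l, ..., -l, where the angle is that
    between the z axis and a vector of length sqrt(l(l+1)) whose projection is m."""
    if l == 0:
        return [(0, None)]              # no angular momentum, so no direction
    length = math.sqrt(l * (l + 1))
    return [(m, math.degrees(math.acos(m / length)))
            for m in range(l, -l - 1, -1)]

if __name__ == "__main__":
    for m, angle in orientations(3):
        print(f"m = {m:+d}   angle = {angle:.1f} degrees")
```

For l = 3 there are 2l + 1 = 7 orientations, and the vector for m = 3 makes an angle of 30 degrees with the axis instead of lying along it.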

Fig. 69. — Dependence of wave functions on angle. $\Theta^2$ plotted in polar diagram. 
Panels, left to right: l = 1, m = ±1; l = 1, m = 0; l = 2, m = ±2; l = 2, m = ±1; l = 2, m = 0. 

244. Series and Selection Principles. — All the states for a given 
value of l and n, but different m, have the same function of r, and 
the same energy. We shall find that this is still true with an 
arbitrary central field, so that even in that problem the solution 
is degenerate. Physically, so long as the angular momentum is 
determined, it cannot make any difference as far as the energy is 
concerned which way the orbit is orientated, on account of the 
spherical symmetry. Thus we often group together the various 
substates with the same l and n but different m, regarding them 
as constituting a single degenerate state, with a (2l + 1)-fold 
degeneracy. For hydrogen, the energy as a matter of fact 
depends only on n, so that all states of the same n but different 
l values are degenerate, but this is not true in general for a central 
field. It is convenient, rather, to group all the states of the same 
l value but different n together to form a series, since they are 
closely connected physically, having the same functions of angle, 
while those of the same n merely happen to have the same energy, 
but without important physical resemblances. The series of 
different l values are conventionally denoted by letters, derived 
from spectroscopy. We have the following table: 

l value    Series               Order of degeneracy 
0          1s, 2s, 3s, . . .    1 
1          2p, 3p, 4p, . . .    3 
2          3d, 4d, . . .        5 
3          4f, 5f, . . .        7 
4          5g, . . .            9 

By order of degeneracy we mean simply the number of sublevels 
of different m values. 

The classification into series becomes important when we consider 
the transition probabilities from one level to another. We 
recall that these are given by the matrix components of the 
electric moment between the states in question. When these 
components are computed, it is found that there are certain 
selection rules: 

1. The component is zero unless the l's of the two states differ 
by ±1 unit. 

2. The component is zero unless the m's differ by 0 or ±1 unit. 

The latter rule is easily proved. For, suppose we compute the 
matrix components of x + iy, x - iy, z, which are simple combinations 
of x, y, z, the three components of displacement. If 
we find the matrix components of all three of these to be zero for 
a given transition, the transition will be forbidden. Now these 
three quantities, in polar coordinates, are $r\sin\theta\,e^{i\phi}$, $r\sin\theta\,e^{-i\phi}$, 
$r\cos\theta$, respectively. If u is $R\Theta e^{im\phi}$, we have 
$(x + iy)u = rR\sin\theta\,\Theta\,e^{i(m+1)\phi}$, showing that this quantity has a matrix component 
only to states having the quantum number m + 1, since the 
quantity on the right could be expanded in series of functions with 
many values of n and l, but only the one value m + 1. Similarly 
$(x - iy)u = rR\sin\theta\,\Theta\,e^{i(m-1)\phi}$, allowing transitions only from 
m to m - 1, and $zu = rR\cos\theta\,\Theta\,e^{im\phi}$, allowing only transitions 
in which m does not change. The proof of the selection principle 
for l is slightly more difficult, involving the theorem that 
$\sin\theta\,P_l^m(\cos\theta)$ or $\cos\theta\,P_l^m(\cos\theta)$ can be expanded in spherical 
harmonics whose lower index is l + 1 or l - 1 only. 
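
The two selection rules amount to a simple test on the quantum numbers of the two states. The sketch below encodes only the rules as stated; it does not compute any matrix components, and the function names and the string of series letters are illustrative assumptions.

```python
SERIES = "spdfg"   # series letters for l = 0, 1, 2, 3, 4

def allowed(l1, m1, l2, m2):
    """True if neither selection rule forces the electric-moment matrix
    component between the states (l1, m1) and (l2, m2) to vanish."""
    return abs(l1 - l2) == 1 and abs(m1 - m2) <= 1

def allowed_series(letter):
    """Series to which a level of the given series can have transitions."""
    l = SERIES.index(letter)
    return [SERIES[j] for j in (l - 1, l + 1) if 0 <= j < len(SERIES)]

if __name__ == "__main__":
    print(allowed(1, 0, 0, 0))      # a p state to an s state
    print(allowed(2, 0, 0, 0))      # a d state to an s state
    print(allowed_series("p"))
```

A p level can thus combine with s and d levels only, which is the statement that transitions occur between adjacent series.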

The selection rules have the following results: If we arrange 
the series in order, spdf . . . , a level of one series can only have 
transitions to the immediately adjacent series. This gives us 
the transitions indicated in Fig. 70 (all of the transitions between 
upper states are not indicated; merely some of the more important 
ones down to lower states). The series of lines arising from 
transitions of the p states to 1s is called the principal series; from 
the s terms to 2p, the sharp series; from the d terms to 2p, the 
diffuse series; from the f terms to 3d, the fundamental series. 
The letters s, p, d, f are the initials of these series. When the 
matrix components are worked out, the strongest lines are those 
in which l decreases by one unit (principal, diffuse, and fundamental 
series), and those for which l increases (as the sharp 
series) are weaker. Of course, on account of the degeneracy in 
l in hydrogen, the different series are not separated, but they are 
in other atoms, and it is for those that the classification is important. 
To see this, we must study the energy levels in the general 
central field. 

Fig. 70. — Energy levels and allowed transitions and series in hydrogen. 

245. The General Central Field. — We shall find that in 
discussing atomic structure, we shall wish to consider that each 
electron moves in a central field, but not an inverse square field. 
The field is rather the sort which we should have if there were a 
nucleus of charge Z units, surrounded by a spherical ball of 
negative charge, having a total charge — (Z — 1) units, corre- 
sponding to the remaining electrons of the atom. Such a field 

has a potential $-2Z(r)/r$, where Z(r) goes from 1 at large r to Z at 


small r. For such a potential, most of our discussion goes through 
without alteration. The differential equation can be separated 
in the same way, and the functions of angle are just the same, 
so that our classification into series, vector model, and selection 
principles holds as with hydrogen. The only difference comes in 
the function of r, and in the values of the energy levels. We can 
no longer solve the equation exactly, and shall use the qualitative 
method of discussion. In Fig. 71 we show a diagram, like Fig. 
66, in which we plot $-\frac{2Z(r)}{r} + \frac{l(l+1)}{r^2}$. The potential is so 
chosen that for r greater than unity, Z(r) is just unity, but for 
smaller r's Z(r) = 10 - 9r, so that the charge approaches 10 
at r = 0, but joins on smoothly at r = 1. It is obvious that the 
s electrons are greatly affected by the change in potential. The 
1s wave function is located practically all inside r = 1. Thus 
its potential curve is practically $-\frac{2(10 - 9r)}{r} = -\frac{2(10)}{r} + 2(9)$ 
for the whole range. In other words, it is like a hydrogen problem 
of nuclear charge 10 units, but with the constant correction 
2(9) to be added to the energy. The energy of such a state would 
be $-(10)^2 = -100$, and when we add our constant 18, it is -82 
units, showing that this level is very tightly bound. Similarly 
the 2s is largely inside, though not so completely, and to a somewhat 
poorer approximation its energy is $-\frac{(10)^2}{4} + 18 = -7$ units. 

The higher s orbits, however, project out into the region beyond 
r = 1, where the potential is hydrogen-like with charge 1, and we 
shall discuss them in a moment. The p, d, . . . states, on the 
other hand, are almost entirely outside the range where the 
potential is not hydrogen-like. Their energy levels and wave 
functions are almost exactly like those of hydrogen. 
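
The rough energies quoted for this model potential follow from the hydrogen formula with a constant shift. The sketch below repeats that arithmetic for the inner levels and gives the unshifted hydrogen value for the nonpenetrating ones; it is only the crude estimate described above, not a solution of the wave equation, and the function and parameter names are assumptions.

```python
def inner_level_energy(n, z_inner=10, slope=9):
    """Estimate for a level lying wholly inside r = 1, where
    -2 Z(r)/r = -2 z_inner/r + 2 slope for Z(r) = z_inner - slope * r:
    a hydrogen-like level of charge z_inner, raised by the constant 2 * slope."""
    return -z_inner**2 / n**2 + 2 * slope

def outer_level_energy(n):
    """Nonpenetrating level, which sees unit charge: the hydrogen value -1/n^2."""
    return -1 / n**2

if __name__ == "__main__":
    print("1s:", inner_level_energy(1))
    print("2s:", inner_level_energy(2))
    print("2p:", outer_level_energy(2))
```

The estimate is not meaningful for the higher s levels, which project beyond r = 1; for n = 3 it would even give a positive energy.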

It is seen from this discussion that we can divide the levels in 
such a case into three classes: (1) those entirely inside the range 
of large potential, which will prove to be those inside the atom; 
(2) those half in and half out; and (3) those entirely outside. 
The levels of larger I values do not penetrate the inside, and 
belong to group 3. In this case, we reach this situation with I — 
1, but with larger cores of negative charge about the nucleus, and 
so larger regions where the potential is much greater numerically 
than in hydrogen, the p electrons, or in some cases the d or even 
/ electrons, are penetrating. For the lowest I values, in any 


case, the orbits of large n are partly outside but penetrate inside, 
and those of small n are entirely inside the core of negative charge. 
These penetrating orbits have quite different energy values from 
the nonpenetrating ones, so that the different series do not lie 


-l >- 

Fig. 71. — Potential and energy levels for a central field, with Z(r) = 10 — 9r 
from r = to 1, Z(r) = 1 for r greater than unity. Left-hand diagram on 
different energy scale. 

on top of each other, as in hydrogen. For the orbits which 
penetrate in their inner parts only, we get a formula for the 
energy, from the quantum* condition. This formula is most 
conveniently derived using Bohr's form of the azimuthal quan- 
tum condition. We have fp r dr = n r h for the radial quantum 

condition. Then for hydrogen, I l p r dr -\- kh\= nh = — 7 — - h, 



where k is Bohr's azimuthal quantum number. Thus $p r dr = 

, ■ — kh. For a penetrating orbit with our form of potential, 

the integral over the outer part of the orbit, where the potential, 
and hence p r , are hydrogen-like, will have just the same value as 
here, if we use the proper energy. For the inside, however, p r 
is much greater, so that there is an additional contribution to the 
integral, as we see from Fig. 72. This contribution, moreover, 
is roughly the same for all terms of the same k value, since the 

Fig. 72. — Phase space and phase integral for r, penetrating and nonpenetrat- 
ing orbits. (1) and (2): Nonpenetrating orbits of same k, different n. (3) 
combined with (2) : Penetrating orbit, having same energy as (2) , but in a non- 
Coulomb field, so that it has a different quantum number and phase integral. 
Shaded area represents the quantum defect $\delta$. 

inner part of the orbit depends almost entirely on the angular 
momentum alone. Thus we have for the general case 


$$\oint p_r\,dr = \frac{h}{\sqrt{-E}} - kh + \delta_l h,$$

where $\delta_l$ is a function of k only, to the first approximation. The 
result must be $n_r h$, by the radial quantum condition, so that we have 

$$E = -\frac{1}{(n_r + k - \delta_l)^2} = -\frac{1}{(n - \delta_l)^2},$$

where n is the total quantum number, and where $\delta_l$ is called the 
quantum defect. A more careful discussion, using the Wentzel-
Kramers-Brillouin method, shows that the same formula still 
holds when we use $\sqrt{l(l + 1)}$ in place of k, and remember that 
we must use half quantum numbers. This formula, which can 
be written, in wave numbers, $E = -R/(n - \delta_l)^2$, is called Rydberg's 
formula, and was first discovered experimentally by Rydberg. 
We see then that the penetrating orbits fall into series as 
the nonpenetrating ones do, but that we must subtract the 
quantum defect from the quantum numbers. These quantum 
defects range from 0 for the nonpenetrating orbits to sometimes 
quite large values, even of the order of 5 or 6, for the s electrons 
of heavy atoms. From experimental observations of spectral 
series, we can find the quantum defects, and so tell which orbits 
are penetrating, and which are not. In the next chapter we 
shall discuss in more detail the energy levels for the orbits entirely 
inside the atom, which are most directly concerned in atomic 
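
The remark that quantum defects can be found from observed series may be illustrated by a short numerical sketch. It assumes energies expressed in the units of this chapter, in which hydrogen has E = -1/n^2, and simply inverts the Rydberg formula. The function name and the sample term values are hypothetical, invented only to show the arithmetic; they are not measurements.

```python
import math

def quantum_defect(n, energy):
    """Invert E = -1/(n - delta)^2 for delta.

    `energy` is assumed to be in units where hydrogen has E = -1/n^2.
    """
    if energy >= 0:
        raise ValueError("a bound level must have negative energy")
    n_effective = 1.0 / math.sqrt(-energy)  # this is n - delta
    return n - n_effective

# Hypothetical term values for one series (same l, successive n).
# These numbers are made up for illustration only.
series = {3: -0.3600, 4: -0.1400, 5: -0.0740}
for n, energy in series.items():
    print(n, round(quantum_defect(n, energy), 3))
```

If the defects so obtained are nearly the same for all members of the series, the series obeys the Rydberg formula, and a defect near zero would mark the orbits as nonpenetrating.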

The wave functions for the central field of the type we are 
discussing are not very different in general from those for hydro- 
gen. But there are important differences in detail. We note 
that a hydrogen-like orbit corresponding to the problem of 
nuclear charge Z is 1/Z times as great as that for nuclear charge 
1. Hence, in the case of Fig. 71, the 1s and 2s orbits are something 
like 1/10 as large as for hydrogen. The penetrating orbits, 
like 3s, 4s, etc., will have the inner loops small in proportion, as 
the 1s and 2s are, but the outer parts, being in a field of charge 1, 
will be large. Thus there will be a much greater disparity 
between the size of the inner and outer loops than even for hydro- 
gen, the outer ones being much more important in consequence. 
We may see this from the Wentzel-Kramers-Brillouin method. 
Here the wave length goes inversely with $p_r$, and the amplitude inversely 
with its square root. In the penetrating part of the orbit, $p_r$ is much greater than for 
hydrogen, for the same total energy, so that amplitude and wave 
length become extremely small. The physical way to say this 
is that the electron moves very fast when it penetrates the core 
and is exposed to the whole charge of the nucleus, and hence 
spends but a very short time there, so that the wave function is 
small. For actually computing the wave functions, we can best 
use numerical integration of the differential equation, or the 


method of Wentzel, Kramers, and Brillouin. We shall discuss 
wave functions more in detail in the next chapter. 


1. Work out the spherical harmonics for l = 3, and draw diagrams for 
them similar to Fig. 69. 

2. Prove from the differential equation that the associated spherical 
harmonics are orthogonal. Verify this for the cases of l = 1 and 2. 

3. Carry out the solution of the radial wave function for hydrogen, deriv- 
ing Eqs. (5), (6), and (7), following the method outlined in the text, and 
verifying that if the series does not break off it represents a function which 
becomes infinite as r approaches infinity. 

4. Show that $y^2\,dr$, where y = rR, is proportional to the probability of 
finding the electron between r and r + dr. Compute radial wave functions 
for states 1s, 2s, 3s for hydrogen, and draw graphs of $y^2$. 

5. Prove that for a radial wave function without nodes (l = n - 1), for 
nuclear charge Z, the maximum of y comes at $n^2/Z$. 

6. Using the results of Prob. 3, Chap. IX, set up the radial phase integral 
for Bohr's model of hydrogen, showing that $E = -1/n^2$. Using the properties 
of the ellipse mentioned in Prob. 4, Chap. VII, verify the statements 
of Sec. 242 regarding the dimensions of the orbits. 

7. Draw an energy level diagram in which the substates of different m's 
are shown, drawing them as if slightly separated, including states 1s, 2s, 
3s, 2p, 3p, 3d. Indicate all transitions allowed by the selection principles 
for l and m, as in Fig. 70. 

8. Prove that the potential used in Fig. 71 is what would be found with 
a nucleus of 10 units charge, surrounded at distance unity by a hollow sphere, 
with 9 units of negative charge uniformly distributed over the surface. 

9. A rough model of the inner electrons of the sodium atom can be 
obtained by assuming the nucleus of charge 11 units; a shell of radius 0.09 
units, with two electronic charges spread over the surface; and a shell of 
radius 0.58 units, with 8 electrons spread over it, so that the net charge is 
1 unit positive. Set up a diagram like Fig. 71 for such a potential field, 
drawing the potential functions for s, p, d electrons. Find which orbits 
are nonpenetrating. 

10. Using the potential of Fig. 71, and Bohr's azimuthal quantum condi- 
tion, compute the positions of 3s, 4s, and 5s levels. To do this, evaluate 
the radial quantum integral, computing separately the parts inside and 
outside r = 1, set the sum equal to $n_r h$, and solve for the energy, using 
numerical methods if necessary to solve the transcendental equation. Find 
how closely the result fits with the Rydberg formula, computing quantum 
defects for each level. 

11. In the field of Fig. 71, the p electrons do not have exactly the hydrogen 
energies, for their wave function is not zero in the region inside r = 1, 
where the potential is not hydrogen-like. Compute the first-order perturbed 
value of the energies of 2p, 3p, 4p, by using hydrogen wave functions as the 
starting point of a perturbation calculation, and assuming the difference 
between the hydrogen potential and the actual one as perturbative potential. 


Compute quantum defects for each level, seeing how well the Rydberg 
formula is obeyed. It is to be noted that in such a case as this, the second- 
order perturbation is often more important than the first, so that our calcula- 
tion is not very accurate. 

12. Apply the Wentzel-Kramers-Brillouin method to the wave functions 
of hydrogen, computing approximate radial functions for 3p, 4p, and com- 
paring with the exact solutions. 



The electrons in an atom move, to an approximation, in central 
fields of force, each in the field produced by the nucleus and the 
average charge of the other electrons. Thus, as we have seen 
in the last chapter, there are different quantum numbers which 
they can have. We can have in an atom 1s, 2s, 2p, . . . electrons. 
All electrons of a given total quantum number, inside the 
atom, have roughly the same radius for the maximum of their 
wave functions, and roughly the same energy, in contrast to the 
electrons which are largely outside, in which s and p electrons 
are more tightly bound on account of penetration. We can then 
group the electrons of the same total quantum number together 
into shells, those of n = 1 forming what is called the K shell, 
those with n = 2 the L shell, n = 3 the M shell, etc., the letters 
K,L,M, . . . coming from x-ray notation. The inner electrons 
are the most tightly bound and hardest to remove, and hence 
connected with the highest frequencies in the spectrum: the K 
series of x-rays, connected with the electrons of the K shell, has 
shortest wave length, L series next, and so on. On the other 
hand, an outer electron is shielded from the nuclear attraction 
by the presence of the other electrons; for the electrical force 
acting on a charge in a spherical distribution is what we should 
have if we imagined a sphere drawn about the center through the 
charge in question, forgot about all charge outside this sphere, 
imagined the charge inside the sphere concentrated at the center, 
and calculated its attraction by the inverse square law. Thus 
for an inner electron we forget about almost all the other electrons 
and have practically the unadulterated attraction of the 
nucleus, but with an outer electron the number of other electrons 
within the sphere is almost equal to the nuclear charge and almost 
cancels it, leaving only a small net attraction, and an easily 
detached electron. It is convenient in this connection to speak 
of an "effective nuclear charge" $Z_e$, and a shielding constant S; 
$Z_e$ is the charge which, placed at the center, would produce the 



same attraction as the nucleus and electrons, and thus varies 
from Z for the inner electrons down to the order of magnitude of 
1 for the outer ones, and S is defined by $Z_e = Z - S$, so that S 
measures roughly the number of electrons inside the sphere in 
question. In general, we see that each electron in an atom, or 
at least each shell, will have a different shielding constant. And 
now it is an important fact that the energies involved in ordinary 
chemical and physical processes are only large enough to remove 
or disturb the outer electrons of an atom, and leave the inner 
ones unaffected. Only x-rays, very violent bombardment, and 
such extreme means can disturb the inner electrons, and as a 
result we need not consider them in ordinary chemical and physi- 
cal applications. 

246. The Periodic Table.— The series K,L, M, . . . of shells 
has no obvious end, and yet an atom has but a finite number of 
electrons. It is evident, then, that the shells cannot all be filled. 
The attraction of the nucleus will pull electrons into the lowest 
shells, until they are filled, and then the rest will have to go into 
higher ones. The capacity of a shell is strictly limited, according 
to a very important principle called the exclusion principle 
(excluding more than a certain number of electrons from a shell), 
so that a K shell can contain only 2 electrons, an L shell 8, an 
M shell 18, an N shell 32, and so on. Using this principle, we 
can begin to see how the atoms build up, and in so doing we under- 
stand the structure of the periodic table (see Fig. 73), the fact 
that when atoms are tabulated according to atomic number their 
properties repeat themselves in a regular way. Thus hydrogen 
has but one electron, which naturally prefers to go into the K 
shell. Of course, it does not have to; it can be in a higher shell, 
or level, corresponding to a higher energy, and then it is an 
excited electron. But this is not a stable situation: collision with 
another atom or molecule, or interaction with radiation, is most 
likely to absorb the extra energy and permit the atom to fall to 
its lowest and most stable energy level, losing its excitation, so 
that this lowest level is the normal state. This situation of the 
existence of excited states, but the preference for the normal 
state, is characteristic of all the atoms, and for the moment we 
are describing the normal states, for they are the ones in which 
we ordinarily find the atoms. 

To resume, helium has two electrons, and in the normal state 
they are both in the K shell. This shell is now completed, no 



more electrons can be bound in it, and such a completed shell is 
characteristic of the inert gases, of which helium is one. Lithium, 
with three electrons, would have two K electrons, and one L, and 
the latter would be loosely bound, and could be easily detached. 
In connection with this, we observe that lithium is an alkali 
metal, very much inclined to form a singly charged positive ion, 
which it does by losing the one electron, the loss of unit negative 



Fig. 73. — Periodic table of the elements, with electron configuration of lowest 


charge being the same as gaining unit positive charge. Next, 
beryllium with four electrons has two K's and two L's, and can 
easily lose the latter to form a divalent positive ion. Thus we 
go through boron with two K's and three L's, carbon with four 
L's (forming sometimes the ion with four positive charges) and 
nitrogen with five L's. By this time, however, the attractions 
between the outer electrons and the nucleus have become rather 
large, and they are not easy to detach. The reason for this is 
that as we get more electrons in a shell, the effective nuclear 


charge gets larger. For the electrons in a shell cannot shield 
each other very effectively; offhand we cannot say whether they 
are inside or outside the sphere of the last paragraph, and as a 
matter of fact the contribution to the shielding constant made 
by an electron in the same shell we are considering is only about 
0.35 of an electronic unit. Thus if the effective nuclear charge 
for lithium's L electron were 1.30 (which is about the right 
amount, equal to Z — S where Z = 3, S = 1.70 for the two K 
electrons, which do not shield perfectly), then for one of the two L 
electrons in beryllium we should have 4.00 — 1.70 — 0.35 = 1.95, 
and for an L electron in boron 5.00 - 1.70 - 0.70 = 2.60, increas- 
ing 0.65 for each atom, until for nitrogen we have 3.90 and for 
oxygen 4.55. Since the electrostatic attractions are proportional to 
the nuclear charge, this means that it is much harder to remove an 
electron from nitrogen than from lithium. By the time we come 
to oxygen and fluorine, we hardly have positive ions formed at 
all. But now another situation comes in: the attractions become 
so strong that an atom can pull an extra electron or two into its 
outer shell, forming a negative ion. Thus oxygen very easily 
forms a singly charged negative ion, and sometimes a doubly 
charged one. It can not go farther than this, for with two extra 
electrons its L shell has eight electrons and is completed. Simi- 
larly, fluorine can form a singly charged negative ion, but no 
more. And finally neon, with ten electrons, has two K's, eight 
L's, and consists of closed shells. It is the next inert gas after 
helium. It forms no ions: it would have to hold an extra electron 
in the M shell, and this would not be tightly bound, so that it 
would not stay; or to form a positive ion, it would have to lose 
one of its L electrons, and these are held too tightly to be removed 
by ordinary chemical processes. Thus it is inert. 
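
The arithmetic of the effective nuclear charges quoted above can be reproduced by a short sketch. It assumes only the two figures given in the text: a shielding of 1.70 from the two K electrons together, and 0.35 from each other electron in the L shell. The function name is hypothetical.

```python
def effective_charge_L_electron(Z, k_shielding=1.70, same_shell=0.35):
    """Effective nuclear charge felt by one L electron in a neutral atom of atomic number Z.

    Assumes two K electrons (total shielding k_shielding) and Z - 2 L electrons,
    each of the other L electrons shielding by same_shell.
    """
    n_L = Z - 2  # number of electrons in the L shell
    if not 1 <= n_L <= 8:
        raise ValueError("this sketch covers lithium (Z = 3) through neon (Z = 10) only")
    S = k_shielding + same_shell * (n_L - 1)
    return Z - S

for symbol, Z in [("Li", 3), ("Be", 4), ("B", 5), ("C", 6), ("N", 7), ("O", 8)]:
    print(symbol, round(effective_charge_L_electron(Z), 2))
```

Each step along the row adds one unit of nuclear charge and only 0.35 of shielding, which is the increase of 0.65 per atom mentioned in the text.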

After neon, we next come to sodium, with eleven electrons. 
This has two K's, eight L's, and the next electron must be an M. 
That is, it has one loosely bound electron, just like lithium. It 
again has a tendency to form a singly charged positive ion, and 
is an alkali metal like lithium. Magnesium, next, has two M 
electrons, and is like beryllium. We begin here to see the origin 
of the periodic table, for we have advanced by eight in our series 
of elements and have come to elements of similar properties. 
The similarity persists in this way up through argon, with eighteen 
electrons. At that point, we must take account of a further 
fact which we have not mentioned. Each of these shells is 


really subdivided into subshells, of slightly different size and 
energy. The subshells are determined by the azimuthal quantum 
numbers, the states s, p, d, . . . of the same total quantum 
number becoming less tightly bound as we go out in the series, 
on account of decreased penetration. The maximum number of 
electrons in a shell of a given designation is invariable: an s group 
can have only 2 electrons, a p group 6, a d group 10, an f group 
14, and so on (2 X 1, 2 X 3, 2 X 5, 2 X 7, . . . , or in general 
2 X the number of subgroups of different m values). Now the 
K shell contains only the s group, accounting, therefore, for its 
maximum number 2 of electrons. The L shell contains a 2s and 
a 2p group, so that its maximum number is 2 + 6 = 8. Simi- 
larly the M has subshells 3s, 3p, 3d, with a maximum number 
2 + 6 + 10 = 18, and N has 4s, 4p, 4d, 4f, with a possibility of 
2 + 6 + 10 + 14 = 32 electrons. When now we examine the 
energies of these various groups, we discover that the differences 
of energy between different subgroups of a shell may often be 
larger than those between different shells, with a result that the 
order of groups is changed. As a matter of fact, beginning with 
the most tightly bound shells, the groups are arranged as far as 
their energy is concerned approximately as shown in the following 
table, in which the first line gives the group, the second the num- 
ber of electrons in the group, the third the total number of elec- 
trons in that group and all inside it, and the last the element 
completing the group, whose atomic number therefore stands 
just above it: 

Group                          1s   2s   . . . 
Electrons in group              2    2   . . . 
Total electrons                 2    4   . . . 
Element completing the group   He   Be   . . . 

Within each shell the subshells are arranged in the order stated, 
but there is overlapping between the shells. 
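
The four lines of such a table can be generated mechanically once an order of the groups is chosen. The sketch below takes as an assumption the approximate order 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p described in this section, and uses the capacity 2(2l + 1) for each group. The names ORDER, SYMBOL, capacity and filling_table are hypothetical, and the symbol A is used for argon as in the text.

```python
# Assumed approximate order of the groups, most tightly bound first.
ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p",
         "5s", "4d", "5p", "6s", "4f", "5d", "6p"]

L_VALUE = {"s": 0, "p": 1, "d": 2, "f": 3}

# Elements whose atomic number equals the running total at the end of each group.
SYMBOL = {2: "He", 4: "Be", 10: "Ne", 12: "Mg", 18: "A", 20: "Ca",
          30: "Zn", 36: "Kr", 38: "Sr", 48: "Cd", 54: "Xe", 56: "Ba",
          70: "Yb", 80: "Hg", 86: "Rn"}

def capacity(group):
    """Maximum number of electrons in a group such as '3d': 2 x (number of m values)."""
    l = L_VALUE[group[-1]]
    return 2 * (2 * l + 1)

def filling_table(order=ORDER):
    """Return rows (group, electrons in group, running total, element completing it)."""
    rows, total = [], 0
    for group in order:
        total += capacity(group)
        rows.append((group, capacity(group), total, SYMBOL.get(total, "?")))
    return rows

for row in filling_table():
    print("{:>3} {:>3} {:>4} {:>3}".format(*row))
```

The order is only approximate: it is a bookkeeping scheme for the normal states, and individual elements may depart from it.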

We now see that at A (argon), although the M shell is not completed, 
still the 3p subshell is, and this is enough to form a closed 
group and an inert gas. Next we come to K, 19, with one 4s 
electron, another alkali, and Ca, 20, with two, an alkaline earth 
like Be and Mg. But now instead of forming a group of 8 by 
adding p electrons, the next additions go into the 3d shell, and 
only after that is filled up do they go into 4p, so that by the time 
we come to the next inert gas, Kr, we have added 18 electrons 
rather than 8 after A. The series of elements in which the 3d 


electrons are being added is the iron group. These have con- 
siderable similarity, because although the 3d electrons are less 
tightly bound than the 4s, they are farther inside the atom, and 
the outside parts of these atoms are quite similar. When we go 
beyond Kr, we repeat the same sort of process, having another 
group of 18 elements in which the 5s, 4d, and 5p electrons are 
being added, before coming to the next inert gas Xe. The 
transition group which we go through here is the Pd group. 
Next after that, after adding the two 6s electrons to form Ba, 
the whole group of 14 4f electrons is added, resulting in a long 
group of remarkably similar elements, the rare earths. As a 
matter of fact, these elements have one 5d electron each, so that 
our scheme is a little misleading in respect to them. After 
finishing the 4f group, the normal procedure repeats itself, the 
5d and 6p being added to complete the shell of 18 interrupted by 
the rare earths and terminated at Rn, and finally the 7-quantum 
electrons being added to give the elements of the last, incom- 
pleted row of the table. 

It is often convenient, in describing an atom in any state, to 
give the number of electrons having each quantum number by a 
symbol, as $1s^2 2s^2 2p^6 3s$ for the normal state of Na, meaning that 
there are two 1s, two 2s, six 2p, one 3s electron. Such an arrangement 
is called a configuration. And a transition between two 
stationary states can be conveniently denoted by writing the 
two configurations. Thus the transition $1s^2 2s^2 2p^6 3p \rightarrow 1s^2 2s^2 2p^6 3s$ 
for Na is a line of the principal series in the optical spectrum; 
the transition Na $1s^2 2s^2 2p^6 3s \rightarrow$ Na+ $1s\,2s^2 2p^6 3s$ represents the process 
of ionizing one of the K electrons of Na; and so on. 

247. The Method of Self-consistent Fields. — We have just 
seen that the electrons of an atom act approximately as if they 
moved in central fields, rather than under the action of the other 
electrons, and have shown that this leads to quantum numbers 
for the electrons, to shells resulting from this, and to the periodic 
properties of the elements as successive shells are filled up. In 
making this idea more precise, we meet the method of self-con- 
sistent fields, developed by Hartree. In this method we assume 

1. The field in which the kth electron moves is obtained by 
taking the wave function of each of the other electrons, squaring 
to get the average density of charge due to these electrons, averag- 
ing over angles to get a spherically symmetrical distribution, 


adding all these charge densities together, and finding the poten- 
tial, together with that of the nucleus, by electrostatics. This, 
of course, will give a nonhydrogenic field, different for each 
electron. 

2. To get the wave function of the kth electron, we solve 
Schrodinger's equation for the field above, using the appropriate 
quantum numbers. Since the field is nonhydrogenic, we must 
use numerical methods, or the Wentzel-Kramers-Brillouin 
method. 

Having found these final wave functions, they must be the 
same ones with which we started step 1. It is this fact which 
leads to the name "self-consistent." If we started with arbitrary 
wave functions, computed a field, solved for the wave functions 
in that field, the final functions would not in general agree with 
the original ones. If we keep on repeating the process, however, 
using in each case the final wave functions of one stage of the 
calculation to begin the next, it rapidly converges so that after 
a few repetitions the field is approximately self-consistent. 
This method has been used for numerical computation of the 
wave functions of a number of atoms. 

248. Effective Nuclear Charges. — The method of self- 
consistent fields, though quite accurate, demands numerical 
computation, and is not well suited for elementary calculations. 
We may instead approximate the wave function of each electron 
by a hydrogen wave function, corresponding to an effective 
nuclear charge $Z - S_i$. To get $S_i$, we should add up the total 
number of electrons within a sphere whose radius is the effective 
radius of the ith electron's wave function. It is easier to figure, 
not by means of the radius, but from the quantum number, since 
to a rough approximation the radius of an orbit is $n^2/(Z - S_i)$, 
so that electrons inside a given one are those of smaller total 
quantum number. The following table proves to give roughly 
the contribution to the shielding constant of a given electron 
from each other type of electron, valid for the electrons found 
in the light atoms. We see that the shielding of one electron 
by a second does not go suddenly from unity to zero as the 
shielding electron's quantum number becomes greater than 
that of the shielded electron, but instead changes gradually, 
in accordance with the fact that each electron really has charge 
distributed over all distances, and it is possible for part of the 
charge to be inside, part outside, a given radius. 



Table 1. — Contribution of One Shielding Electron, of Given 
Quantum Number, to Shielding Constant of Shielded Electron 

Shielding electron 

Shielded electron 




























To illustrate the use of this table, let us take the case of 
Na, Z = 11, in its normal state $1s^2 2s^2 2p^6 3s$. Evidently we 
have three shells, corresponding to the three values of n. Then 
we have 

n = 1: S = 0.35, radius = $n^2/(Z - S)$ = 1/10.65 = 0.09 

n = 2: S = 2(0.85) + 7(0.35) = 4.15, radius = 4/6.85 = 0.58 

n = 3: S = 2 + 8(0.85) = 8.80, radius = 9/2.20 = 4.09. 

The inner radii are as given in Prob. 9, Chap. XXXIII. 
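
The sodium example can be checked with a few lines of code. The sketch below does not reproduce Table 1; it assumes only the three contributions that appear in the worked example (0.35 for another electron in the same shell, 0.85 for an electron in the shell next inside, 1.00 for electrons farther in, and nothing for outer electrons). The function names and the dictionary layout are hypothetical.

```python
def shielding(n, occupation):
    """Shielding constant S for one electron of total quantum number n.

    occupation maps total quantum number -> number of electrons in that shell.
    Contributions are those read off from the Na example, not the full Table 1.
    """
    S = 0.0
    for m, count in occupation.items():
        if m == n:
            S += 0.35 * max(count - 1, 0)  # other electrons in the same shell
        elif m == n - 1:
            S += 0.85 * count              # shell next inside
        elif m < n - 1:
            S += 1.00 * count              # shells farther in shield completely
    return S

def radius(n, Z, occupation):
    """Rough orbit radius n^2 / (Z - S), in the units used in the text."""
    return n ** 2 / (Z - shielding(n, occupation))

sodium = {1: 2, 2: 8, 3: 1}  # normal state of Na, Z = 11
for n in sodium:
    print(n, round(shielding(n, sodium), 2), round(radius(n, 11, sodium), 2))
```

Keeping the occupation numbers as data makes it easy to repeat the calculation for an ion, by lowering the count in the shell from which the electron is removed.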

The calculations we have given so far refer to wave functions, 
rather than energy levels. To investigate the latter, we must 
make a more careful discussion of the theory of the many-body 
problem and its treatment by Schrodinger's equation. 

249. The Many-body Problem in Wave Mechanics. — Our 
treatment of atomic structure so far has been rather intuitive, 
not based directly on Schrodinger's equation at all. We have 
not yet set up the problem of many bodies in wave mechanics. 
To do so, we proceed as follows: Let the problem have N generalized 
coordinates, $q_1 \ldots q_N$. Then we seek a wave function 
$\psi(q_1 \ldots q_N, t)$, such that $\bar{\psi}\psi\,dq_1 \ldots dq_N$ gives the probability 
that the coordinates will be found at time t in the region $dq_1 \ldots 
dq_N$. To set up Schrodinger's equation, we take the classical 
Hamiltonian function, convert it into an operator H by substituting 
$\frac{h}{2\pi i}\frac{\partial}{\partial q_i}$ for $p_i$, and write the equation 
$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$. 

We eliminate time as usual, and have a differential equation 
for $u(q_1 \ldots q_N)$, which is Hu = Eu, E being the energy of the 
whole system. 

There is one simple case of the many-body problem: that 
where there are many particles, exerting no forces on each other. 


That is, we may have n particles, whose coordinates are $x_1 y_1 z_1 
\ldots x_n y_n z_n$, and the potential is $V = V_1(x_1 y_1 z_1) + \cdots + 
V_n(x_n y_n z_n)$, without any terms involving coordinates of two particles 
simultaneously. For such a potential, $-\partial V/\partial x_i = -(\partial V_i/\partial x_i)(x_i y_i z_i)$, 
a force on the ith particle depending only on the coordinates 
of that particle. In such a case, we can separate variables, 
writing $u = u_1(x_1 y_1 z_1) \cdots u_n(x_n y_n z_n)$. For Schrodinger's equation 
can be written 

$$\left[\left(-\frac{h^2}{8\pi^2 m_1}\nabla_1^2 + V_1\right) + \cdots + \left(-\frac{h^2}{8\pi^2 m_n}\nabla_n^2 + V_n\right)\right]u = Eu, \qquad (1)$$

where $\nabla_i^2$ means $\partial^2/\partial x_i^2 + \partial^2/\partial y_i^2 + \partial^2/\partial z_i^2$. A separation of 
variables can be carried through in the usual way, and can be 
summarized as follows: if we write u as a product, as above, 
then Schrodinger's equation is satisfied if 

$$\left(-\frac{h^2}{8\pi^2 m_i}\nabla_i^2 + V_i\right)u_i = E_i u_i,$$

$$E_1 + \cdots + E_n = E. \qquad (2)$$

In the case of atomic structure, and in general with the struc- 
ture of matter, there are forces between the electrons. But 
here it is possible to make an approximation, as we have done: 
we replace the actual force between a given electron, say the ith, 
and the others, by the average which it would have from the 
mean distributions of the other electrons in space. Roughly 
we may say that, while the force with any particular arrangement 
of the other electrons will differ from this value, it will average 
out to give our mean value, and the deviations from the mean 
will not be so large as to destroy the approximation. Thus, 
using such a method, each electron becomes acted on, not by 
the other electrons, but by an averaged field. It is the motion 
in this field that we have considered in the present chapter. 

250. Schrodinger's Equation and Effective Nuclear Charges. — 
The result of the approximate calculation we have made has 
been a set of one-electron wave functions, one for each electron 
of the atom. These satisfy equations which, in atomic units, are 

$$\left[-\nabla_i^2 - \frac{2(Z - S_i)}{r_i}\right]u_i = -\frac{(Z - S_i)^2}{n_i^2}\,u_i. \qquad (3)$$


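Now the potential energy of the whole atom, in atomic units, is 

$$-\sum_i \frac{2Z}{r_i} + \sum_{\text{all pairs}} \frac{2}{r_{ij}}, \qquad (4)$$

if $r_{ij}$ is the distance between the ith and jth electrons. Thus the 
Hamiltonian is 

$$H = \sum_i\left(-\nabla_i^2 - \frac{2Z}{r_i} + \sum_{j\ \text{inside}\ i}\frac{2}{r_{ij}} + \sum_{j\ \text{in same shell as}\ i}\frac{1}{r_{ij}}\right), \qquad (5)$$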

where the two summations are the same thing as the sum over 
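all pairs. If now we assume that $u = u_1 \cdots u_n$, where the u's 
are as we have found, and try to see how good an approximation 
this forms, we have, substituting for the Laplacians, from Eq. (3), 

$$Hu = \left[-\sum_i \frac{(Z - S_i)^2}{n_i^2} + \sum_i\left(-\frac{2S_i}{r_i} + \sum_{j\ \text{inside}\ i}\frac{2}{r_{ij}} + \sum_{j\ \text{in same shell as}\ i}\frac{1}{r_{ij}}\right)\right]u. \qquad (6)$$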

If Schrodinger's equation were satisfied, this would be Eu, where 
E is a constant. This is not true; the first term is a constant 
times u, but the second is a variable function of the r's times u. 
The average value of the last term, however, is approximately 
zero. For $2/r_{ij}$ is the potential, at the ith electron, of the jth 
electron. If the latter is inside, and we average over its position, 
and average to make it spherically symmetrical, the potential 
will be the same as if it were concentrated at the center, or will 
be $2/r_i$. For an electron in the same shell, it turns out that the 
average of $1/r_{ij}$ is about $2(0.35)/r_i$. The summation, for all 
electrons inside or in the same shell as i, is then essentially 
$2S_i/r_i$, just canceling the first term, and leaving as the result, 
using this approximate method of averaging, 

$$Hu = -\sum_i \frac{(Z - S_i)^2}{n_i^2}\,u,$$

showing that we have an approximate solution, and that the 
energy of the atom is $-\sum_i (Z - S_i)^2/n_i^2$. This represents the negative 
of the energy required to remove all the electrons from the 
atom. If we wish to find the energy of the atom by first-order 
perturbation theory, we recall that we must find the diagonal 
term of the energy matrix, or $\int \bar{u} H u\,dv$. This means averaging 
the energy over the wave function, or over the motions of the 
electrons; and to the same approximation we have just used, 
the summations average to zero, leaving the same energy we just 
found. 

As an example of the calculation of energy, we can again 
take the case of Na. The energy of normal Na is, using the 
shielding constants found above, 

$$-\left[2\left(\frac{10.65}{1}\right)^2 + 8\left(\frac{6.85}{2}\right)^2 + \left(\frac{2.20}{3}\right)^2\right] = -321.4\ \text{units}.$$

With one 1s electron removed, making the appropriate changes in 
shielding constants, the energy is 

$$-\left[\left(\frac{11.00}{1}\right)^2 + 8\left(\frac{7.70}{2}\right)^2 + \left(\frac{3.20}{3}\right)^2\right] = -240.6\ \text{units}.$$

The difference is 80.8 units, or 1,094 volt-electrons, representing the 
ionization potential. Similarly with the 2s removed, the energy is 

$$-\left[2\left(\frac{10.65}{1}\right)^2 + 7\left(\frac{7.20}{2}\right)^2 + \left(\frac{3.05}{3}\right)^2\right] = -318.6,$$

leaving an ionization potential of 2.8 units, or about 38 volt-electrons. 
Finally the ionization potential of the 3s, as we immediately 
see, is simply $(2.20/3)^2 = 0.54$ unit = 7.3 volt-electrons. 
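
The sodium energies can be recomputed as a check on the arithmetic. The sketch below assumes the same shielding contributions as in the worked example of Sec. 248 (0.35, 0.85 and 1.00) and evaluates the sum of $-(Z - S_i)^2/n_i^2$ over all electrons; the function names are hypothetical. Carried to more figures than the text, this evaluation gives about -321.2, -240.7 and -318.6 units for the atom and the two ions, within a few tenths of a unit of the figures quoted above, so that the ionization potentials come out near 80.5 and 2.6 units.

```python
def shielding(n, occupation):
    """Shielding constant for one electron in shell n (contributions as in the Na example)."""
    S = 0.0
    for m, count in occupation.items():
        if m == n:
            S += 0.35 * max(count - 1, 0)
        elif m == n - 1:
            S += 0.85 * count
        elif m < n - 1:
            S += 1.00 * count
    return S

def atom_energy(Z, occupation):
    """Total energy -sum (Z - S_i)^2 / n_i^2, in the atomic units of the text."""
    return -sum(count * ((Z - shielding(n, occupation)) / n) ** 2
                for n, count in occupation.items())

Z = 11
normal = {1: 2, 2: 8, 3: 1}
ions = {"1s removed": {1: 1, 2: 8, 3: 1},
        "2s removed": {1: 2, 2: 7, 3: 1},
        "3s removed": {1: 2, 2: 8, 3: 0}}

E_atom = atom_energy(Z, normal)
print("atom", round(E_atom, 1))
for label, occupation in ions.items():
    E_ion = atom_energy(Z, occupation)
    print(label, round(E_ion, 1), "ionization potential", round(E_ion - E_atom, 2))
```

Since the ionization potential is a small difference of two large totals for the outer shells, it is sensitive to rounding in the individual terms.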

251. Ionization Potentials and One-electron Energies. — In 
the method of self-consistent fields, each electronic wave function 
is the solution of a central field problem, for a single electron. 
This one-electron problem has a certain energy, as found in 
the preceding chapter, always negative, very large numerically 
if the electron is tightly bound, smaller if it is more loosely bound, 
and it is natural to ask for the interpretation of this energy. 
The connection with tightness of binding suggests directly 
that the one-electron energies measure the work required to 
remove the electron in question, or the ionization potential, the 
negative energies being the negative of the ionization potentials. 
This proves in fact to be the case. One can compute these 
ionization potentials, by finding the energies of the atom and 
ion and subtracting, and the result proves to be, to the first 
order of perturbation, just the one-electron energy. Thus the K 



ionization potential is given by the distance of the 1s energy 
level below zero in the corresponding one-electron problem, and 
so on. The connection is not very accurate, but it is close enough 
to be very useful. 

Our method of effective nuclear charges, being an approxima- 
tion to the method of self-consistent fields, should show the same 
property, and we can give a simple though not entirely satisfactory 
proof. The negative of the ionization potential is 
the energy of the atom, minus the energy of the ion. If the ith 
electron is to be removed, and if $S_j$ represents a shielding constant 
in the atom, $S_j'$ for the ion, then the energy of the atom, 
minus the energy of the ion, is 

$$-\frac{(Z - S_i)^2}{n_i^2} - \sum_{j \neq i}\frac{(Z - S_j)^2 - (Z - S_j')^2}{n_j^2}.$$

If we set $S_j' = S_j - (S_j - S_j')$, and expand, this is 

$$-\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \neq i}\frac{(Z - S_j)^2 + 2(Z - S_j)(S_j - S_j') + (S_j - S_j')^2 - (Z - S_j)^2}{n_j^2}.$$

Our simple proof holds only in case there is no other electron 
in the same shell as the ith, and if we assume that each electron 
shields by either 1 or 0. Then we have $S_j - S_j' = 0$ if the jth 
electron is inside the ith, 1 if the jth is outside the ith. Thus 
for the ionization energy we have 

$$-\frac{(Z - S_i)^2}{n_i^2} + \sum_{j\ \text{outside}\ i}\frac{2(Z - S_j + \frac{1}{2})}{n_j^2}. \qquad (8)$$

In this case we can easily find the energy of the one-electron 
problem. The potential energy of the field in which the larger 
part of the ith wave function is located is 

$$-\frac{2(Z - S_i)}{r_i} + \sum_{j\ \text{outside}\ i}\frac{2}{r_j}, \qquad (9)$$

where $S_i$ represents the number of electrons inside the ith, and 
the summation is for all outer electrons, assuming constant mean 
radii, which are approximated by $1/r_j = (Z - S_j)/n_j^2$. To verify 
the correctness of this potential, we note that the corresponding 
force, $-2(Z - S_i)/r_i^2$, is what we should have for the charge 


inside the sphere concentrated at the nucleus, and the constant 
terms of the summation are added to make the potential con- 
tinuous at the outer shells. Thus the wave equation for Ui is 

-V, 2 - 


j outeids i 

(Z - S*) 


which gives immediately 

1 outside i 

or, using the value of l/r,- f 

j outside i 

agreeing with the value (8) already found, except that the correc- 
tion term *^ in (Z - £,- + }4) is missing. This formula, more- 
over, is interesting, in that it shows that the shielding has 
two effects on the energy: (1) The energy has the term 
— (Z — S?)/n£ instead of — Z 2 /n 2 , as we should have with an elec- 
tron in the unshielded field of the nucleus. This effect, reducing 
the magnitude of the ionization potential, is called the inner shield- 
ing, since it comes from the inner electrons. (2) There is also the 
summation over the outer electrons, likewise resulting in a 
reduction of ionization potential, and called the outer shielding. 
As we see from our derivation, the outer shielding results from 
the rearrangement of shielding constants of the outer electrons 
when an inner electron is removed. 
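
The algebra leading to (8) can be checked numerically. The following is a minimal Python sketch, assuming (as in the proof) one electron per shell and shielding by exactly 1 or 0. The function names and the example, Z = 3 with single electrons in the shells n = 1, 2, 3, are hypothetical and chosen only for illustration; energies are in Rydberg units.

```python
def total_energy(Z, shells):
    """Energy in Rydberg units: minus the sum of (Z - S)^2 / n^2 over electrons."""
    return -sum((Z - S) ** 2 / n ** 2 for n, S in shells)


def compare_energies(Z, ns, i):
    """ns lists one electron per shell, innermost first; electron i is removed."""
    # Assumption of the proof: each electron is shielded by 1 for every
    # electron inside it, and by 0 for every electron outside it.
    atom = [(n, j) for j, n in enumerate(ns)]
    # In the ion, electrons outside the removed one lose one unit of shielding.
    ion = [(n, S - 1 if j > i else S) for j, (n, S) in enumerate(atom) if j != i]
    direct = total_energy(Z, atom) - total_energy(Z, ion)

    n_i, S_i = atom[i]
    outer = atom[i + 1:]
    # Formula (8), with the correction term 1/2.
    eq8 = -(Z - S_i) ** 2 / n_i ** 2 + sum(2 * (Z - S + 0.5) / n ** 2 for n, S in outer)
    # One-electron energy, without the correction term.
    one_electron = -(Z - S_i) ** 2 / n_i ** 2 + sum(2 * (Z - S) / n ** 2 for n, S in outer)
    return direct, eq8, one_electron


direct, eq8, one_electron = compare_energies(Z=3, ns=[1, 2, 3], i=0)
print(direct, eq8, one_electron)
```

The first two numbers agree, while the one-electron energy differs from them by the sum of $1/n_j^2$ over the outer electrons, which is just the missing correction term.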


1. The K series in the x-ray spectra comes when a K electron is knocked
out, and an L, M, . . . electron falls into the vacant place in the K shell.
The lines are Kα (if an L electron falls in), Kβ (an M), etc. Write down the
configurations before and after the Kα and Kβ transitions of Mo.

2. Show that the frequencies of the lines of the K series are less than the
frequency of light necessary to cause ionization of the K electron. Compute
the K ionization potential and the Kα line for Ca, and show that they fit in
with the general case.

3. Moseley's law is that the square roots of x-ray term values (ionization
potentials) form a linear function of the atomic number. This would obviously
be true if there were just inner shielding, for then the square root
would be simply $(Z - S)/n$. Investigate how closely this is true when there
is outer shielding as well, computing K and L term values for electrons from
Z = 10 to Z = 20, and seeing how closely the square roots fall on straight
lines.

4. Iso-electronic sequences are sets of ions, all of the same number of
electrons, but with different nuclear charges, and hence different degrees of
ionization. Compute the ionization potentials, or term values,
$1s^2 2s^2 2p \to 1s^2 2s^2$, $1s^2 2s^2 3s \to 1s^2 2s^2$, for the atoms
Z = 5 to 10, indicating what ions they are (as for Z = 6, $1s^2 2s^2 2p$
is C⁺). Investigate to see whether these term values follow Moseley's law
that the square root of the term value is a linear function of atomic number.

5. Using the approximation that the radius of a shell is $n^2/(Z - S)$, draw
curves giving the radius of each shell as function of Z for all atoms up to
Z = 20 (compute only enough values to draw the curves).

6. In a closed shell of p electrons, there are two electrons of m = 1, two of
m = 0, two of m = -1. Using the spherical harmonics for these cases,
compute the squares of the wave functions, treating these as electron
densities. Add the densities of all electrons, showing that the sum is
independent of angle, or that the p shell is spherically symmetrical. The
same thing is also true of any completed shell.

7. Given a spherical distribution of charge, where the potential is $2Z_p/r$,
and the force $2Z_f/r^2$, where $Z_p$, $Z_f$ are both functions of $r$, prove
that $Z_f = Z_p - r\,(dZ_p/dr)$.

8. Assuming that the electrons are located on the surfaces of spheres of
radius $n^2/(Z - S)$, find and plot $Z_f$ and $Z_p$ for Na⁺ as functions of $r$.


Atoms by themselves have only a few interesting properties: 
their spectra, their dielectric and magnetic properties, hardly any 
others. It is when they come into combination with each other 
that problems of real physical and chemical interest arise. Atoms 
act on each other with forces, in some cases attractive and in
others repulsive, and in this chapter we shall consider the nature
of these forces, how they arise, and what their results are in
their effect on the physical and chemical structure of the
substance. Interatomic forces in the first place hold atoms together
to form molecules; this forms the province of chemistry. But in
turn they hold molecules together into their various states of
aggregation, as solids, liquids, and gases, and this is ordinarily 
considered to be part of physics. The distinction, however, is 
purely arbitrary, and not at all general. We shall begin by dis- 
cussing the most important types of force, with a little considera- 
tion of the types of substances in which they are found. All the 
interatomic forces of interest in the structure of matter are 
electrical or in some cases magnetic; the only other forces, gravi- 
tational, are far too small to be of significance. We arrange 
the different types according to the way they depend on the 
distance of separation of the atoms. 

252. Ionic Forces. — If two atoms are ionized, they attract or 
repel according to the inverse-square law. If the net charge on one is
$z_1$ units, on the other $z_2$, the potential energy between them is
$z_1 z_2 e^2/r$, if $r$ is the distance between them.

253. Polarization Force. — Atoms are polarizable, as we have 
seen in discussing refractive index, in Sec. 172, Chap. XXIV. 
That is, an atom in an electric field $E$ acquires an electric moment
$\alpha E$. Now suppose that we have an atom or an ion in the presence
of another ion. The ion produces a field $ze/r^2$. This in
turn polarizes the first atom or ion, producing a moment $\alpha ze/r^2$.
The resulting dipole reacts back on the ion, attracting it with a
force equal to the field of the dipole (equal to the moment of the
dipole times $2/r^3$) times the charge on the ion, or $-(2\alpha z^2 e^2/r^5)$.
The potential of this force is $-(\alpha z^2 e^2/2r^4)$, giving always an
attraction.
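
The chain of steps in this argument can be followed numerically. The sketch below is only an illustration under stated assumptions: units with e = 1, an arbitrary polarizability and distance, and function names of our own choosing. It builds the force from the induced moment and checks that it equals the negative slope of the potential $-\alpha z^2 e^2/2r^4$.

```python
def induced_moment(alpha, z, r, e=1.0):
    """Moment induced by the ion's field z e / r^2."""
    return alpha * z * e / r ** 2


def polarization_force(alpha, z, r, e=1.0):
    """Force on the ion: minus (dipole field on the axis) times (charge)."""
    dipole_field = induced_moment(alpha, z, r, e) * 2 / r ** 3
    return -dipole_field * z * e          # equals -2 alpha z^2 e^2 / r^5


def polarization_energy(alpha, z, r, e=1.0):
    """Potential of the polarization force."""
    return -alpha * z ** 2 * e ** 2 / (2 * r ** 4)


def numerical_force(alpha, z, r, h=1e-6):
    """Central-difference estimate of -dV/dr."""
    up = polarization_energy(alpha, z, r + h)
    down = polarization_energy(alpha, z, r - h)
    return -(up - down) / (2 * h)


alpha, z, r = 1.5, 1, 3.0                 # illustrative values only
print(polarization_force(alpha, z, r), numerical_force(alpha, z, r))
```

Both numbers are negative, that is, the force is attractive whatever the sign of the ionic charge, since $z$ enters only as $z^2$.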

254. Van der Waals' Force— Ionic and polarization forces 
are met only with ions. The forces observable at largest dis- 
tances between neutral atoms or molecules, and hence of impor- 
tance in the behavior of liquids and imperfect gases, are called 
Van der Waals' forces, on account of their appearance in Van
der Waals' equation of state for imperfect gases. They arise as
follows. An atom is generally spherically symmetrical and thus
on the average has no external electric field. But this is only
on the average; instantaneously it is not spherical, but the
electrons are at arbitrary positions, and the result gives a dipole
moment, averaging to zero, but instantaneously different from
zero. This dipole polarizes a second atom or molecule. Thus
the field of a dipole of moment $\mu$ is $\mu/r^3$ times a function of angle.
In the two special cases where the dipole points straight toward,
or away from, the atom, the function of angle has the values
$\pm 2$, respectively. In that case, the induced dipole in the second
molecule is $\pm(2\alpha\mu/r^3)$. This produces a field back on the first,
equal to $\pm(4\alpha\mu/r^6)$. The force by which it acts on the original
dipole is equal to the rate of change of the field with $r$, times the
dipole moment, times a function of angle which is $\pm 1$ in the two
cases considered, or

$$\left(\mp\frac{24\alpha\mu}{r^7}\right)(\pm\mu) = -\frac{24\alpha\mu^2}{r^7},$$

with potential energy $-4\alpha\mu^2/r^6$. If we had considered all angles,
we should have got a different constant, but in any case an attraction,
$-\text{constant} \times \alpha\mu^2/r^6$.
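
The same steps can be traced numerically for the special case of a dipole pointing along the line of centers. This is a minimal sketch with arbitrary illustrative values and hypothetical function names, not a calculation for any real atom; it differentiates the reaction field numerically and compares the result with $-24\alpha\mu^2/r^7$.

```python
def back_field(alpha, mu, r):
    """Field returned to the first atom by the dipole it induces in the second."""
    field_at_atom = 2 * mu / r ** 3       # axial field of the instantaneous dipole
    induced = alpha * field_at_atom       # induced dipole, 2 alpha mu / r^3
    return 2 * induced / r ** 3           # axial field of that dipole, 4 alpha mu / r^6


def force(alpha, mu, r, h=1e-6):
    """Dipole moment times the rate of change of the reaction field with r."""
    dfield_dr = (back_field(alpha, mu, r + h) - back_field(alpha, mu, r - h)) / (2 * h)
    return mu * dfield_dr


def energy(alpha, mu, r):
    """Potential energy whose negative slope is the force above."""
    return -4 * alpha * mu ** 2 / r ** 6


alpha, mu, r = 1.2, 0.7, 2.5              # illustrative values only
print(force(alpha, mu, r), -24 * alpha * mu ** 2 / r ** 7, energy(alpha, mu, r))
```

Reversing the sign of `mu` leaves both the force and the energy unchanged, which is why the fluctuating dipole gives an attraction even though its average is zero.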

To calculate the polarization and Van der Waals' forces, we
should have to find $\alpha$ and $\mu$. The calculations for these are
difficult and will not be attempted here, though a derivation
will be given in a later chapter. For the present, however, we
can get some semiempirical formulas which will serve for rough
calculations. First, the polarizability $\alpha$ has the dimensions of
a volume. An argument from a simple model in Sec. 172 showed
that, at least in order of magnitude, the polarizability of a spherical
atom is equal to the cube of its radius. Now the radius of
an electron's orbit can be approximated by $n^2 a_0/(Z - S)$, so
that we might imagine that the polarizability of an atom could
be approximated by the sum of such terms, cubed, for all electrons.
Empirically, one finds that this gives about the right
dependence on $Z$, but not very accurately for $n$: the contribution
of an electron to the polarizability proves to be approximately

$$\left(\frac{n^2 a_0}{Z - S}\right)^3 \times \begin{cases} 4.5 & \text{if } n = 1 \\ 1.1 & \text{if } n = 2 \\ 0.65 & \text{if } n = 3, \text{ etc.} \end{cases}$$

The total polarizability is the sum of such contributions, for all
electrons. As we readily see, only the electrons in the outer shell
make an appreciable contribution, since they have the largest
values of $n$ and the smallest $Z$'s. Hence we may simply multiply
the number $\nu$ of electrons in this shell by the term above. Thus
$\nu = 2$ for an ion with the same structure as the He atom, 8 for
one built like Ne, 8 for one like A, etc.
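
As a rough illustration of this rule, the sketch below multiplies the number of outer electrons by the empirical factor and the cube of the shell radius, in units of $a_0^3$. The function name is hypothetical, and the effective charge $Z - S = 1.70$ used for helium is an assumption made here for the example, not a value taken from this section.

```python
# Empirical factors quoted in the text for n = 1, 2, 3.
N_FACTOR = {1: 4.5, 2: 1.1, 3: 0.65}


def shell_polarizability(n, z_eff, nu):
    """Polarizability in units of a_0^3 for nu outer electrons with Z - S = z_eff."""
    radius = n ** 2 / z_eff               # shell radius in units of a_0
    return nu * N_FACTOR[n] * radius ** 3


# Helium: two 1s electrons; z_eff = 1.70 is an assumed effective charge.
helium = shell_polarizability(n=1, z_eff=1.70, nu=2)
print(helium)
```

With this assumed effective charge the estimate comes out somewhat above the value $1.43a_0^3$ quoted for helium in the problems at the end of the chapter, which is the sort of agreement to be expected from so rough a rule.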

For the Van der Waals' force, we expect an energy
$-\text{constant} \times \alpha\mu^2/r^6$. We shall consider the problem more in
detail in Chap. XLII, Sec. 301, where it is shown that the energy is

$$-\frac{3}{2}\,\frac{\alpha\mu^2}{r^6},$$

and where in addition we have the relation

$$\mu^2 = \frac{\alpha\,\Delta E}{2}.$$

In this formula, $\Delta E$ is the difference of energy of that transition
from the normal state which contributes most to the refractive
index and dispersion. Ordinarily this can be taken to be the same
as the ionization potential of the atom. Thus, since we know how
to find ionization potentials from our effective nuclear charges,
we may use empirical or approximately calculated polarizabilities
to get coefficients for the Van der Waals' attraction.
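
A short sketch shows how a coefficient is obtained in practice. It assumes that the two relations above read: energy $= -\tfrac{3}{2}\alpha\mu^2/r^6$ and $\mu^2 = \alpha\,\Delta E/2$, so that the energy is $-\tfrac{3}{4}\alpha^2\,\Delta E/r^6$. The helium values, $\alpha = 1.43a_0^3$ and $\Delta E = 1.80$ Rh, are those quoted in the problems at the end of this chapter; the function names are our own.

```python
def mean_square_moment(alpha, delta_e):
    """mu^2 = alpha * Delta E / 2."""
    return alpha * delta_e / 2


def vdw_coefficient(alpha, delta_e):
    """Coefficient c in the energy -c / r^6, that is (3/2) alpha mu^2."""
    return 1.5 * alpha * mean_square_moment(alpha, delta_e)


def vdw_energy(alpha, delta_e, r):
    """Energy in units of Rh when alpha is in a_0^3, delta_e in Rh, r in a_0."""
    return -vdw_coefficient(alpha, delta_e) / r ** 6


# Helium values quoted in the problems of this chapter.
c6 = vdw_coefficient(alpha=1.43, delta_e=1.80)
print(c6, vdw_energy(1.43, 1.80, r=5.0))
```

The coefficient appears here in units of Rh times $a_0^6$; converting to ergs requires only the value of Rh.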

The three types of force we have enumerated all fall off as 
inverse powers of the distance. If we inquire further, we find 
that there is a whole series of terms, in higher and higher inverse 
powers of r. Thus between ions we have a series commencing 
with terms in $1/r$ and $1/r^4$, between atoms commencing in $1/r^6$,
but having higher terms arising from interaction of the induced 
dipoles of both atoms with each other, interaction of dipoles and 
charges with quadrupole moments, etc. The complete series 
would be difficult to evaluate. In addition to these forces, there 
are other quite different ones, coming when the atoms are so 
close that their charge distributions actually begin to overlap. 


Since these distributions fall off exponentially with distance, 
as in hydrogen functions, these types of force all fall off exponenti- 
ally and for that reason cannot be expanded in inverse powers 
of r at all (the exponential function possesses a singularity at 
infinity and so cannot be expanded in power series in 1/r). The 
forces are sometimes grouped together, but we prefer to break 
them up into three classes. 

255. Penetration or Coulomb Force. — As one atom penetrates 
another, there will be forces on account of pure electrostatics, 
even if the two atoms do not distort each other. Let the outer 
shell of each atom penetrate within that of the other (Fig. 74a). 
Then the part of each which penetrates the other finds itself in a 
field attracting it toward the nucleus of the other, since it is no
longer shielded by all shells of the other. The result is an attraction
of the charge of each for the other, pulling the whole atoms
together. On the other hand, as the atoms get still closer, the
whole system of inner shells of one would get inside the outer shell
of the other (see Fig. 74b). These inner cores are both positively
charged on the whole and will repel, a repulsion more than enough
to counteract the attraction, in general. Hence at sufficiently
close distance, the penetration force will be repulsive. In
between, there will be some distance at which the force will be
zero and there will be equilibrium.

Fig. 74. Penetration of one atom by another. Circles represent shells of
electrons. (a) Attraction. Negative charge of each atom penetrates within
the outer shell of the other, being attracted to the positive nucleus.
(b) Repulsion. Nucleus of each atom penetrates the outer shell of the other,
the repulsion of the nuclei for each other outbalancing the attractions.

256. Valence Attraction. — The penetration force acts even 
though the atoms are not distorted. The force of attraction 
principally concerned in valence, however, is an additional force 
resulting from the distortion of one atom by the other. The 
distortion produced by ordinary electrostatics is at least approximately
taken care of by computing the polarization, as stated in
Sec. 253, but there is an additional effect, resulting from the
operation of the exclusion principle, and the existence of electron 
spins, and which leads to a tendency for electrons to form stable 
pairs, agreeing with the ideas of G. N. Lewis regarding homopolar 
valence, or valence attraction between uncharged atoms. To 
understand this, even approximately, we must look more closely 
into the exclusion principle. In addition to their charge, elec- 
trons also act like little magnets, having a north and south pole. 
This is as if the charge were to rotate, forming a little electric 
current around a circle, and corresponding magnetic lines of force. 
The result is called electron spin. Now when we have a pair 
of electrons, it turns out that their spins can be oriented in just 
two possible ways: either parallel to each other, or opposite or 
antiparallel. If they are parallel, then the exclusion principle 
comes in and says they cannot be in the same shell. But if they 
are opposite the principle does not operate. It is a result of this 
that the allowed numbers of electrons in the various groups in an 
atom are all even numbers. Thus, in the s shell, after we have 
one electron, we can add a second if its spin is opposite to the 
first, for then the exclusion principle does not act. But if we 
now try to add a third, its spin must be parallel to one of the two 
already there, and the exclusion forbids it. Similarly a p group 
really contains three different subgroups, each of which can 
contain but two electrons, with opposite spins. Analogous 
results hold for the other groups. We see, then, that the sub- 
group of two electrons with opposite spins is a configuration 
which electrons like to form, and that only two electrons can 
enter such a configuration, so that there is a tendency toward 
pairing. But now it appears that such a pair can be formed by 
two electrons in different atoms, just as well as by two in the same 
atom. Thus if each of two atoms has just one electron, rather 
than two, in one of its subgroups, and if these two electrons have 
opposite spin, they can form a pair held in common by the two 
atoms, actually localized in the space between the atoms, and 
tending simply by electrostatic forces to hold the atoms together, 
the attractions of this negative concentration of charge for the 
nuclei, which must have a net amount of positive charge, more 
than counterbalancing the repulsions between like charges at 
large distances, though at smaller distances the force becomes 
repulsive, on account of the ordinary penetration effect. This 
is the origin of homopolar valence. We see that every electron
lacking from a closed shell can be interpreted as giving the possibility
of forming a valence bond, so that for example the halogens
have a single valence, oxygen and sulphur have two, hydrogen
has one (one electron missing from 1s), and so on.
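
The counting in this section can be put in a few lines. The sketch below is only a bookkeeping aid with hypothetical function names: each group of azimuthal quantum number $l$ has $2l + 1$ subgroups, each subgroup holds two electrons of opposite spin, and the number of valence bonds is taken as the number of electrons lacking from the closed group.

```python
def subgroups(l):
    """Number of subgroups in a group: 1 for s, 3 for p, 5 for d."""
    return 2 * l + 1


def capacity(l):
    """Two electrons of opposite spin can enter each subgroup."""
    return 2 * subgroups(l)


def valence(l, electrons_in_group):
    """Electrons lacking from the closed group, each a possible valence bond."""
    return capacity(l) - electrons_in_group


print(capacity(0), capacity(1), capacity(2))   # s, p, d groups
print(valence(1, 5))   # a halogen, with five p electrons
print(valence(1, 4))   # oxygen or sulphur, with four p electrons
print(valence(0, 1))   # hydrogen, with one 1s electron
```

The capacities are all even numbers, as the pairing argument requires.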

257. Atomic Repulsions. — If one brings two atoms close 
enough together, they will always repel and resist further 
approach. This is what we know physically as the impenetrability 
of matter. It is a result of the exclusion principle, again. If 
we force two atoms so close together that the shells of the two 
atoms overlap, and if these shells are all filled with electrons, 
then we are really trying to force more electrons into the same 
region of space than the exclusion principle allows. What happens
is that the electrons then move outside of this region, the
atoms become distorted, and the resulting increase of energy is 
interpreted as a force of repulsion between the atoms. These 
actions commence as soon as closed shells begin to overlap appre- 
ciably, and as a result the atoms have rather sharp boundaries, 
and for some purposes may be considered as having definite 
sizes. We should notice that, if the outer shells of the atoms are 
not closed, this repulsion can be altered. Thus, if two lithium 
atoms approach, each having a closed K shell but only one electron
in its 2s shell, either of two things can happen. If the two
L electrons happen to have parallel spin, then the exclusion
principle operates between them, and they will repel each other, 
as if they had only closed shells. But if the spins are opposite, 
then the outer shells can coalesce, forming a shared electron pair, 
and resulting in attraction. Even in such a case, however, we 
finally meet repulsion as we bring the atoms together. In the 
first place, at close enough separation, the K shells would begin 
to overlap, and since they are closed shells they would repel in 
the usual way. But also the pure electrostatic interaction gives 
repulsion at small enough distances. For with more and more 
penetration, we get to the point where the nuclei are close 
together, in the midst of a combined set of shells of electrons from 
both atoms. Increasing closeness will then increase the repulsion
between the nuclei, without much changing anything else, and 
this repulsion will finally become great enough to cancel all other 

258. Analytical Formulas for Valence and Repulsive Forces. — 
The three types of force which we have just been discussing, 
Coulomb penetration force, valence attraction, and repulsion, 


all depend on the actual overlapping of the charge distributions 
of two atoms. Here again we can find a simple approximate 
formula, which is yet accurate enough to be decidedly useful. 
Since the charge distribution falls off in general exponentially 
with the distance, we may assume that the potential energy also 
falls off exponentially: energy = Ce~ ar , where r is the distance 
between nuclei. The constant C is negative for attractions, posi- 
tive for repulsions. The value of a, of course, will be different 
with each type of force, and each type of atom. Nevertheless, 
we can give extremely rough rules which yet suffice to give the 
order of magnitude of a. First we set up, for each of our two 
atoms, the "radius" of the outer shell, n 2 /(Z — S). We add 
these radii for the two, multiply by 1 if the electrons in the outer 
shells are p electrons, as in closed shells, but by 1.4 if they are s 
electrons in both atoms, as in a molecule made of two alkali 
atoms. Let the result be r . Then as far as order of magnitude 
is concerned, the energy is a constant times e~ i{r/ro) for the pure 
repulsion between closed shells. In the valence attraction case, 
where the curve has a minimum, we can combine the valence and 
Coulomb forces, since both behave about the same. Then the 
result is approximately 

• (J e -6(r/r ) _ C"g-3(r/ro) / 

the first term representing the repulsion close in, the second the 
attraction farther out. The constants as we have written them 
are for the normal state of the atoms and molecules, and in this 
case it is found that the equilibrium distance for the valence 
attraction comes approximately at r . This results, as we readily 
verify, by writing the formula in the form 

D [ e -<^) _ ar'fe-O} 

or more generally 

D e -2a(r-r ) _ 2De -nB(r - r ° ) , (1) 

where a is a constant, which we have set approximately equal to 
3/r . This form of potential curve has been used by Morse, and 
he has tabulated values of D, a, and r for a number of molecules, 
in excited as well as normal states. 
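
The equivalence of the two forms, and the position and depth of the minimum, are easily verified numerically. The sketch below assumes the Morse form $De^{-2a(r - r_0)} - 2De^{-a(r - r_0)}$ with $a = 3/r_0$, so that the constants of the two-exponential form are $C' = De^6$ and $C'' = 2De^3$. The values of $D$ and $r_0$ are arbitrary illustrations, not data for any molecule.

```python
import math


def morse(r, D, a, r0):
    """Morse curve D exp(-2a(r - r0)) - 2D exp(-a(r - r0))."""
    x = math.exp(-a * (r - r0))
    return D * x * x - 2 * D * x


def two_exponential(r, c_rep, c_att, r0):
    """The form C' exp(-6 r/r0) - C'' exp(-3 r/r0)."""
    return c_rep * math.exp(-6 * r / r0) - c_att * math.exp(-3 * r / r0)


D, r0 = 4.5, 1.4                 # illustrative values only, arbitrary units
a = 3 / r0
c_rep = D * math.exp(6)          # C'  = D e^6
c_att = 2 * D * math.exp(3)      # C'' = 2 D e^3

for r in (0.8 * r0, r0, 1.5 * r0, 3 * r0):
    print(r, morse(r, D, a, r0), two_exponential(r, c_rep, c_att, r0))
```

The two columns agree at every distance, and the energy at $r = r_0$ is $-D$, lower than at any neighboring point.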

The constant coefficient D, or the corresponding coefficient 
in the pure repulsive energy, is not easily given in a general way. 
We can easily see its significance, however. In Fig. 75 we plot a
Morse potential curve, observing that it has a minimum at $r_0$,
the energy at this point being $-D$, while at infinite separation
the energy is zero. Thus D represents the energy required to 
pull the atoms apart to infinity if they are initially at rest at the 
equilibrium distance, or, in other words, the energy of dissocia- 
tion of the pair of atoms. These energies, for actual molecules, 
vary between a fraction of a volt-electron and several volt- 
electrons, depending on the tightness of binding of the molecule. 

A few simple rules help in estimating $D$, as for instance that the
larger $r_0$, the smaller $D$ tends to be (for example, F₂ is more tightly
bound than I₂, the F atom being smaller than I); molecules with a
double or triple valence bond have larger $D$'s than with single bonds;
etc.

The repulsive energy between closed shells, which we have
approximated by $Ce^{-4(r/r_0)}$, is generally associated with an ionic or
Van der Waals' attraction, resulting again in a minimum. This
minimum, however, is ordinarily at much larger distances than $r_0$,
more nearly $2r_0$ or even larger. This is in consequence of two things:
first, the attractive forces are rather weaker than the valence
attractions, and second, the repulsion between closed shells is naturally
larger, and effective at larger distances, than the repulsion
found in valence compounds. In an actual case, where we
know the Van der Waals' or ionic force, we can then make
an estimate of the distance of separation at the minimum,
and find $C$ from the condition that the correct total potential
has zero slope at this point. To get a number comparable with
those met in valence attraction, we should write the repulsion
in the form $De^{-4(r/r_0 - 2)}$. Then in actual cases $D$ comes out of the
order of a few volt-electrons.
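
The determination of $C$ from the zero-slope condition can be sketched as follows, for the case of a Van der Waals' attraction $-c/r^6$ combined with the repulsion $Ce^{-4(r/r_0)}$. Setting the derivative of the sum equal to zero at the minimum $r_m$ gives $C = 6c\,r_0\,e^{4r_m/r_0}/(4r_m^7)$. The numbers below are arbitrary and dimensionless, and the function names are hypothetical.

```python
import math


def repulsion_constant(c6, r0, r_min):
    """C such that C exp(-4 r/r0) - c6 / r^6 has zero slope at r_min."""
    return 6 * c6 * r0 * math.exp(4 * r_min / r0) / (4 * r_min ** 7)


def total(r, C, c6, r0):
    """Closed-shell repulsion plus Van der Waals' attraction."""
    return C * math.exp(-4 * r / r0) - c6 / r ** 6


c6, r0 = 1.0, 1.0                 # illustrative values only
r_min = 2 * r0                    # minimum assumed at about 2 r0, as in the text
C = repulsion_constant(c6, r0, r_min)

h = 1e-6
slope = (total(r_min + h, C, c6, r0) - total(r_min - h, C, c6, r0)) / (2 * h)
print(C, slope, total(r_min, C, c6, r0))
```

The slope vanishes at the chosen distance, and the repulsion written as $De^{-4(r/r_0 - 2)}$ then has $D = Ce^{-8}$, a much more moderate number than $C$ itself.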

Fig. 75. Morse potential curve, $De^{-2a(r - r_0)} - 2De^{-a(r - r_0)}$.

Often one finds the repulsive forces of which we have just
spoken approximated by an inverse power of $r$, as $b/r^n$, where $n$
proves to be about 8 or 9. We immediately see that both functions,
exponential and inverse power, behave similarly, being
large for small $r$, small for large $r$, so that either form can be
used, though, since the repulsion depends on penetration, which
actually goes off exponentially, we can be sure that the inverse
power term is not so accurate. We can readily find out why $n$
has about the value 9. The repulsive term is of importance,
and can be found experimentally, and $n$ determined, near the
minimum of the energy curve. For Van der Waals' or ionic
forces, as we have mentioned, this proves to come at about
$2r_0$. Then suppose that we choose $b$ and $n$ so that $b/r^n$ has the
same value and slope as $Ce^{-4(r/r_0)}$ when $r = 2r_0$. We have

$$\frac{b}{(2r_0)^n} = Ce^{-4(2r_0/r_0)}, \qquad \frac{nb}{(2r_0)^{n+1}} = \frac{4}{r_0}\,Ce^{-4(2r_0/r_0)},$$

from which, dividing one by the other, $2r_0/n = r_0/4$, $n = 8$,
approximately as is found experimentally. Many discussions,
particularly of the structure of ionic crystals, are based on this
inverse power formula, which has been used by Born and others.

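
The matching of value and slope can be carried out for any matching distance, and the following minimal sketch does so. The function name and the numerical constants are assumptions for illustration only. Dividing the slope condition by the value condition gives $n/r = 4/r_0$, so that $n = 4r/r_0$ at the matching point.

```python
import math


def match_inverse_power(C, r0, r_match):
    """Choose n and b so that b / r^n matches C exp(-4 r/r0) in value and slope."""
    value = C * math.exp(-4 * r_match / r0)
    slope = -(4 / r0) * value
    # For b / r^n the ratio (-slope / value) equals n / r.
    n = -slope * r_match / value
    b = value * r_match ** n
    return n, b


C, r0 = 100.0, 1.0                # illustrative values only
n, b = match_inverse_power(C, r0, r_match=2 * r0)
print(n, b)
```

A minimum somewhat beyond $2r_0$ would give a correspondingly larger exponent, which is consistent with observed values of 8 or 9.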
259. Types of Substances: Valence Compounds. — Now that 
we have investigated the types of interatomic forces, we should 
consider them with reference to the different types of substances 
in which they occur. Broadly speaking, there are two main 
types of substances, corresponding to the two principal kinds 
of interatomic attractions, the ionic and the valence forces. 
Let us arrange our valence compounds roughly in order of melting 
or boiling points, starting with the most volatile, and ending 
with the most stable. The first substances on the list are not 
compounds at all, and indicate valence only in a sort of negative 
way: they are the inert gases, He, Ne, A, Kr, Xe. Since the 
outer shells of these are already completed, they form no ions, 
and they have no electrons to be shared and have no possibility 
of valence forces, and form no compounds. Next we come to a 
group of diatomic molecules, for example H₂, O₂, N₂, F₂, Cl₂, Br₂,
CO, HCl, HBr, etc. These are held together by valence forces
(HCl and HBr are somewhat ambiguous, and might be considered
to be ionic compounds; this ambiguity is met in almost all H
compounds). For example, each atom in H₂ has one electron;
they share these, making a pair. In O₂, each atom has six
L electrons; but they share two pairs (a double bond). As we go
on, we come next to fairly simple polyatomic molecules. We
have water, ammonia, methane: H₂O, NH₃, CH₄, all rather
plainly valence compounds (though the ambiguity of which we
spoke previously makes an ionic interpretation possible as well),
with each hydrogen held by a single valence bond to the other
atom. We might well include with these the ammonium ion,
NH₄⁺, presumably built like methane. Other simple ones are
CO₂, CS₂, with double bonds. Then we certainly should include
some of the simple organic compounds, as acetylene C₂H₂
(triple bond between the carbons), ethylene C₂H₄ (double bond
between the carbons), ethane C₂H₆ (single bond).

All these molecules of which we have spoken are held together 
by valence forces. On the other hand, there are also Van der 
Waals' forces between molecules, though of a smaller order of 
magnitude than the valence forces, and these hold the substances 
together in liquids and solids, all of low boiling points, but of 
increasing stability as the molecules become heavier and more 
complicated. The very considerable difference in order of 
magnitude between the valence and the Van der Waals' forces 
is significant, for this brings it about that the separate molecules 
preserve their identity, even when crowded close together. 

More complicated organic compounds naturally come next 
in the list. They still preserve to some extent the property 
of existing as separate molecules, in gas, liquid, and solid, so 
that they still have both valence forces between atoms, and Van 
der Waals' forces between molecules. But as the molecules 
get more and more complicated, the Van der Waals' forces get 
larger and larger proportionally, so that with the fairly compli- 
cated ones they are of the same order of magnitude as the valence 
forces. Many complicated organic compounds dissociate when
heated, rather than going through a change of state, since the 
heat necessary to melt and boil the substances becomes more 
and more nearly equal to that required to break up the molecules. 
It becomes, in other words, harder and harder to distinguish 
separate molecules, the solid acting more and more like a single 
big molecule. 

The silicates form a group of compounds slightly suggesting 
the organic compounds in their complexity. They contain the 
group SiO₄⁴⁻, which can be best described as a pure valence
compound, Si(O⁻)₄, held together just like methane, Si being
analogous to C. In many compounds the silicate groups are
joined together, by sharing oxygens, as in the double group
Si₂O₇⁶⁻, or (O⁻)₃Si-O-Si(O⁻)₃, a neutral O atom being joined
by its two bonds to the two Si atoms. This process of sharing 
oxygens may continue, until finally there is a network formed 
through the whole crystal, the metallic ions, as Ca++, etc., merely 
fitting into empty space in the network, and all traces of molecu- 
lar structure being lost. Thus these crystals are held together 
by forces so strong that they are not easily broken up. They are 
insoluble and refractory, and in fact form a great proportion 
of all the minerals. 

260. Metals. — The metals form a type of substance more or 
less by themselves, but in general resembling valence compounds. 
There is a definite indication, at least in some of them, that there 
is a network of valence forces between the atoms, running through 
the metal, and holding it together to form a solid. At the same 
time, the simple Coulomb penetration force seems to account 
for a considerable part of the cohesion of metals. The network 
of valences seems to be connected with the electrical conductivity: 
an electron shared between two atoms can go to either one, and 
if the sharing exists through the solid, the electrons can migrate 
and carry a current. For many purposes, it is more correct
in a metal to give up the idea that an electron is attached to a
given atom at all, and treat the electrons as free to move from one
place to another, like the molecules of a perfect gas. The typical
metallic states are solid and liquid. When a metal is vaporized, 
the tendency toward molecular formation does not seem to 
be strong. The vapors of such metals as have been examined 
show both monatomic and diatomic molecules; one wonders 
if polyatomic ones would not also be found if the experiment were 
made, acting simply like little pieces of the large metallic crystal. 

261. Ionic Compounds. — The ionic compounds are not so easy 
to classify in a definite order as valence compounds, principally 
because they are more alike. The primary fact about ionic 
compounds is that they are held together by electrostatic forces, 
the atoms appearing in the ionized state. The forces between the 
atoms depend only on the distance, and are independent of 
the presence of other atoms (except in the matter of polarization). 
The laws governing the formation of ionic crystals are simple 
electrostatic ones, such as that positive and negative ions tend
to approach as closely as possible, ions of the same sign go as far
apart as possible, charges in small volumes tend to equalize them- 
selves, and so on. As a particular result of these, there is no 
tendency to form molecules. It is almost impossible to build up 
out of ions any structure which would not have large electrostatic 
fields around it; and further ions would be attracted by these 
fields, so that the substance can build up indefinitely. Further, 
the electrostatic fields are rather large, compared with the valence 
forces. The physical nature of these substances follows from 
the principles very easily. Their most characteristic form is the 
solid, where they form crystals in which the ions are arranged on 
a regular lattice. There is no trace of molecular structure in the 
lattice. They are hard and stable, often harder than metals, 
and of high melting point, although, of course, there is large 
variation from one compound to another. The vapor phase is an 
unimportant one for practically all ionic substances. Much 
more interesting in general than either liquid or vapor is the ionic 
state in water solution. Water, on account of its great dielectric 
constant, decreases all electrostatic forces. It thus almost 
removes the forces holding such a crystal together, and the solid 
breaks up into ions dissolved in the water. 
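
The size of this effect can be estimated roughly. The sketch below divides the Coulomb energy $z_1 z_2 e^2/r$ of a pair of ions by the dielectric constant of the medium. Everything numerical in it is an assumption made here for illustration: $e^2$ is taken as 14.40 electron volts times angstroms, the dielectric constant of water as about 80, and the separation of 2.8 angstroms merely as a typical distance between neighboring ions. The use of the ordinary dielectric constant at such small distances is itself only a crude approximation.

```python
E2_EV_ANGSTROM = 14.40            # e^2 in electron volts times angstroms


def coulomb_energy(z1, z2, r_angstrom, dielectric_constant=1.0):
    """Energy z1 z2 e^2 / (K r) of two ions, in electron volts."""
    return z1 * z2 * E2_EV_ANGSTROM / (dielectric_constant * r_angstrom)


# A singly charged positive and negative ion 2.8 angstroms apart (assumed).
in_vacuum = coulomb_energy(+1, -1, 2.8)
in_water = coulomb_energy(+1, -1, 2.8, dielectric_constant=80.0)
print(in_vacuum, in_water)
```

The attraction, several electron volts in empty space, is reduced in water to a small fraction of an electron volt, which makes it plausible that the crystal should break up into separate ions.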

When we ask about individual ionic compounds, we can well 
classify according to the ions from which they are made. The 
fundamental building stones are in every case ions of atoms; and 
the ions are of two sorts, positive and negative. The metals 
practically always form positive ions. They easily lose their 
valence electrons, as we have seen, so that all the electrons outside 
closed shells are removed, giving the alkali ions a charge 1, alka- 
line earths 2, the aluminum group 3, and so on. As we go through 
the series of elements, we see that even the nonmetals sometimes 
form positive ions, as Cl with seven positive charges. Sometimes,
however, their ions are negative, though about the only important
atoms forming negative ions are O and S, forming singly and
doubly charged ions, and the halides F⁻, Cl⁻, Br⁻, I⁻. These
atoms add electrons to make a closed shell, instead of losing them. 
It is obvious why there are so few: adding electrons makes an 
atom negatively charged, so that it tends to repel other electrons. 
It is a process which cannot go on far. The negative halide ions 
generally exist by themselves. The oxide ion also exists by itself 
in oxides; but it also forms complex negative ions, with positive 
nonmetallic ones, which are the most important negative ions 


known. There are two alternative explanations of these radicals, 
either as pure ionic compounds, or as a combination of this with 
valence forces. For example, the sulphate ion can be regarded 
as being formed from a completely stripped sulphur ion and 
doubly charged oxygens: $\mathrm{SO_4^{-2} = S^{+6}(O^{-2})_4}$. But if we assume 
that the oxygens have only single negative charges, we have the 
other possible structure $\mathrm{S^{+2}(O^{-1})_4}$. With this structure, the 
sulphur has four electrons, as carbon does, and so has four homo- 
polar valence bonds; and the oxygens have the same electron 
structure as halogens, with a single valence bond. Thus the 
sulphur can be bound to the four oxygens by valence bonds, 
assisting the electrostatic attraction, and the structure would 
have similarity to methane or carbon tetrachloride. This latter 
explanation seems to be nearer the truth, since it can be calculated 
that the work required to form the completely stripped positive 
ion in the ionic model would be much greater than the work 
necessary to form the other structure. 


1. Find the potential energy between two helium atoms, using our approxi- 
mate methods for calculating Van der Waals' and repulsive forces, and com- 
pare with the more accurate value 

\ 7.7e- 2 " 3 V«o _._l^*f 10-10 ergs, 

where $a_0 = 0.53 \times 10^{-8}$ cm. The polarizability of helium is $1.43a_0^3$, and 
its ionization potential 1.80 Rh. Compare these with simple calculated values. 

2. Using the potential of Prob. 1, compute the equilibrium distance of 
separation between two helium atoms, and find the energy of dissociation, 
in ergs, and volt-electrons. Compare the equilibrium distance with the 
mean distance in the liquid, which has a density of 0.14, assuming atoms to 
be spaced on a regular lattice, so that the mean distance will be $1/\sqrt[3]{n}$, if n 
is the number of atoms per cubic centimeter. 

3. Find a radius of the helium atom for use in kinetic theory, assuming 
that two helium atoms at temperature 300° abs., with kinetic energy of 
$\tfrac{3}{2}kT$, collide head on. Find how close they come before they stop, and 
compare this molecular diameter with the distance $r_0$. 

4. Two energy levels of H2 coincide with the lowest energy of the atoms 
at infinite separation, one an attractive level (corresponding to valence 
binding, with the spins of the two electrons opposed), and one repulsive 
(the spins being parallel, so that the exclusion principle operates). Plot 
the energies of both terms as functions of distance, deriving the exponents 
according to our approximate laws, and determining the scale from the 
fact that the energy of dissociation of the molecule is about 4.3 electron volts, 



and that the energy of the repulsive term at the distance of molecular 
equilibrium is about 8 electron volts. 

5. Compute by our approximate laws the distance of separation $r_0$ of 
the atoms in the normal states of the valence compounds given below, and 
compare with the experimental values tabulated : 


Molecule      $r_0$ (Angstroms) 
C$_2$ 
H$_2$ 
I$_2$ 
O$_2$ 

6. Compute by our approximation polarizabilities for the following ions, 
and compare with the experimental values tabulated : 


Ion           $\alpha \times 10^{24}$ 
Mg$^{++}$ 

7. Compare the distance of separation of atoms in the metallic crystals 
tabulated below with the sum of the quantities $n^2/(Z - S)$ for the two atoms. 








8. Compute the interatomic potential energy for NaCl at large distance, 
assuming it is composed of $\mathrm{Na^+}$ and $\mathrm{Cl^-}$, so that there will be the ionic 
force, and at the same time a polarization force, the sodium polarizing the 
chlorine. Show that the polarization of sodium by chlorine can be neglected. 
Using the polarizabilities of Prob. 6, show that the potential energy is 

-, t-. — r-. electron volts. 

|_ r/oo (r/a ) 4 J 

9. The observed interatomic distance in the NaCl molecule is 2.73 Angstroms. 
Compute the constants C and a in the repulsive potential $Ce^{-ar}$. 
Find a by the rules we have used, and determine C so that the sum of the 
repulsive potential, and the attractive potential of Prob. 8, will have a 
minimum at the required distance. 

10. Using the value of a found in the preceding problem, find the equivalent 
value of n in the repulsive potential $b/r^n$ for the NaCl problem, seeing 
how nearly it equals 9. 



In the preceding chapter, we have considered interatomic 
forces, and their effect in determining the nature of substances. 
When we begin to think more precisely of what we mean by the 
nature of substances, we conclude that the equation of state, and 
the closely related specific heat, are among the most important 
properties. We shall, therefore, take them up, giving necessarily 
enough thermodynamics and statistical mechanics to make 
calculations possible. Our investigations will be concerned with 
the thermal motion of the nuclei, moving under the interatomic 
potential which we have investigated. We shall naturally not 
be able to treat all sorts of substances; liquids, for instance, are 
so complicated that comparatively little progress has yet been 
made in understanding their properties. But gases and crystal- 
line solids both present features of simplification which we can 
make use of. 

262. Gases, Liquids, and Solids. — Before passing to our 
analysis, let us consider what types of behavior we wish to explain. 
We can conveniently divide our discussion into gases, liquids, 
and solids. A monatomic gas, as an inert gas, is the simplest 
case: we have only to find its pressure and total energy, as a 
function of volume and temperature, a task which can be carried 
out when we know the law of force between molecules. Gases 
of valence compounds, however, are more complicated. Their 
equation of state is not much harder to approximate than with 
monatomic gases, at least at low density, for on account of the 
rotation of the molecules they act on the average as if they were 
spherically symmetrical, and we need use only the intermolecular 
force averaged over angles in deriving the equation of state. In 
the specific heat, however, there are two forms of energy to con- 
sider: the translational kinetic energy of the molecules as a whole, 
which acts just as in monatomic gases, but also the rotational 
and vibrational energy of the individual molecules. This involves 



a different sort of calculation. A still further complication 
appears in gases of some ionic substances, and of some valence 
compounds like I$_2$ and NO. Here there are several types of 
molecule which can be simultaneously present in the gas, as 
$2\mathrm{I} \rightleftharpoons \mathrm{I_2}$, $2\mathrm{Na} \rightleftharpoons \mathrm{Na_2}$, and a proper treatment of the equation of 
state and specific heat would demand investigation of the equi- 
librium concentrations of the constituents, and their change with 
pressure and temperature. 

A liquid is more complicated than a gas, in that the molecules 
are so closely in contact that they can no longer be treated as 
points. The liquids of the inert gases are, of course, exceptions, 
and there are a few other exceptions, diatomic and polyatomic 
substances whose molecules rotate even in the liquid, and so act 
like spherical systems. But with most liquids the molecules are 
bulky enough so that they do not rotate, and are definitely non- 
spherical in their average behavior. In considering the equations 
of state, in particular the compressibility, one can no longer, as 
with a gas, neglect the change of volume of the molecules with 
change of pressure. As the molecules become larger and larger, 
as with complicated organic compounds, the distinction between 
forces within and forces between molecules becomes lost, and 
the whole liquid must be treated as a single complex, the volume 
being determined more and more definitely by the space required 
to pack the atoms together. 

The state of close-packing of atoms which we have just men- 
tioned is definitely reached with solids. In fact, with noncrystal- 
line solids, there is no sharp distinction between the states, as 
glass for instance shows, solidifying perfectly continuously from 
the liquid. The solids with definite melting points are the 
crystals, which have a definite lattice arrangement of the atoms 
which is not met in the liquid. This regularity of arrangement 
is the simplifying feature which makes it possible to treat crystals 
theoretically. We can here commence our discussion with the 
state at absolute zero of temperature, where the atoms are at 
rest, and the whole crystal is in a position of equilibrium of the 
interatomic forces. The compressibility of such crystals can 
be fairly easily found from the forces, and this has been carried 
through particularly successfully for some of the ionic crystals. 
Then we can treat the crystal in thermal agitation by investigat- 
ing the small oscillations of the atoms about their positions of 
equilibrium, using the method of normal coordinates. This 


makes it possible to consider both equation of state and specific 
heat with fair ease and generality. 

Out of all the group of topics which we have suggested, there 
are a few which can be treated theoretically fairly successfully. 
First, there is the equation of state of rare gases, or of polyatomic 
gases whose molecules rotate so as to be spherically symmetrical 
on the average. This is what we take up in the present chapter. 
Secondly, there is the specific heat of rotation and vibration of 
molecules. Thirdly, one can consider the equilibrium between 
different types of molecules in a gas, the question of chemical 
equilibrium. Fourthly, the equation of state and specific heat 
of crystalline solids can be investigated. As a preliminary to 
these, we must extend our treatment of statistical mechanics, 
which we have already considered slightly in Chap. XXX. We 
first follow out the ideas of classical statistics a little further, 
treat the equation of state of a gas by those methods, and then 
go to quantum statistics, asking what changes are introduced. 

263. The Canonical Ensemble. — Following Chap. XXX, we 
consider a phase space; that is, a space in which each coordinate 
and each momentum of the system is plotted as a variable. 
Let the coordinates be $q_1 \ldots q_n$, the momenta $p_1 \ldots p_n$, 
the Hamiltonian function $H(q_1 \ldots p_n)$. Then the phase 
space has 2n dimensions, and a point in this space represents a 
whole system (for instance a sample of gas). Next we set up 
an ensemble of points in this space, the number in the volume 
element $dq_1 \ldots dp_n$ being proportional to 
$f(q_1 \ldots p_n)\,dq_1 \ldots dp_n$. We assume all points of the 
ensemble to be equally likely; 
that is, we assume that the probability that the coordinates 
and momenta of the system actually lie in the region $dq_1 \ldots dp_n$ 
is proportional to the number of points of the ensemble in this 
region, or is proportional to $f\,dq_1 \ldots dp_n$. Then to find the 
average of any function of the coordinates and momenta, as 
$F(q_1 \ldots p_n)$, we multiply by $f$, integrate, and divide by the 
integral of $f$: 

$$\bar F = \frac{\int F f\,dq_1 \ldots dp_n}{\int f\,dq_1 \ldots dp_n},$$ 

as we saw in Chap. XXXI. 
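
The averaging rule can be tried numerically. The following is only a minimal sketch for a single degree of freedom: the helper name `ensemble_average`, the Gaussian weight in p, the integration limits, and the grid size are all assumptions made for illustration, not part of the text.

```python
import math

def ensemble_average(F, f, q_range, p_range, steps=400):
    """Average F(q, p) with weight f(q, p) by the midpoint rule:
    (integral of F*f) divided by (integral of f)."""
    q0, q1 = q_range
    p0, p1 = p_range
    dq = (q1 - q0) / steps
    dp = (p1 - p0) / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        q = q0 + (i + 0.5) * dq
        for j in range(steps):
            p = p0 + (j + 0.5) * dp
            weight = f(q, p)
            num += F(q, p) * weight
            den += weight
    # The cell area dq*dp is common to both integrals and cancels.
    return num / den

# Assumed weight: Gaussian in p with variance sigma2, uniform in q.
sigma2 = 2.0
weight = lambda q, p: math.exp(-p * p / (2.0 * sigma2))
mean_p2 = ensemble_average(lambda q, p: p * p, weight, (0.0, 1.0), (-12.0, 12.0))
print(mean_p2)  # should lie very close to sigma2
```

Because only the ratio of the two integrals enters, the normalization of the weight never has to be known in advance.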

Now in particular we set up the canonical ensemble, 

$$f(q_1 \ldots p_n) = \text{constant}\; e^{-H(q_1 \ldots p_n)/kT},$$ 

where T is the absolute temperature. This ensemble gives the 
probability that a system in thermal equilibrium at temperature 


T will have its coordinates and momenta within given limits. 
The essential physical reason for this is the following: Suppose 
we have two systems, the first of coordinates and momenta 
$q_1 \ldots q_n$, $p_1 \ldots p_n$, the second $q_{n+1} \ldots q_m$, $p_{n+1} \ldots p_m$, 
with the separate Hamiltonian functions $H_1(q_1 \ldots p_n)$, 
$H_2(q_{n+1} \ldots p_m)$. Then physically we know that, if 1 and 2 are at the 
same temperature, and are then allowed to interact slightly, 
as by interchanging energy, it will be found that they are already 
in equilibrium with each other, and they already form a combined 
system in equilibrium at this temperature. This, in fact, is 
the definition of equality of temperature. But this is satisfied 
for the canonical ensemble. Thus if the separate systems are 
in equilibrium, their distribution functions are 

$$f_1(q_1 \ldots p_n) = \text{constant}\; e^{-H_1(q_1 \ldots p_n)/kT},$$ 

$$f_2(q_{n+1} \ldots p_m) = \text{constant}\; e^{-H_2(q_{n+1} \ldots p_m)/kT}.$$ 

By the laws of probability, then, the probability that simultaneously 
the coordinates $q_1 \ldots p_n$ will be in the range $dq_1 \ldots dp_n$, 
and that $q_{n+1} \ldots p_m$ will be in $dq_{n+1} \ldots dp_m$, is proportional to 
the product of these probabilities, or constant 
$e^{-(H_1 + H_2)/kT}\,dq_1 \ldots dp_n\,dq_{n+1} \ldots dp_m$. But now suppose that the two systems are 
allowed to interact. The combined system will have an energy 
$H_1 + H_2 + H'$, where $H'$ is a small interaction potential, 
depending perhaps on all coordinates and momenta, negligibly 
small compared with the separate energies (as for instance an 
interatomic force between the negligibly small number of 
molecules on the boundary between the two systems, which 
permits the flow of heat between them). Then according to the 
canonical ensemble, the distribution of the whole combined 
system in thermal equilibrium should be constant $e^{-(H_1 + H_2 + H')/kT}$. 

But we observe that, except for the negligible energy H', this 
is just the distribution before the interaction, so that the two 
systems were already in equilibrium before the interaction, and 
by definition are at the same temperature. This result is true 
only with the canonical ensemble, since it depends on the expo- 
nential form, adding exponents being equivalent to multiplying 
the functions. 
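
The multiplicative property can be checked on a toy case. In the sketch below a handful of discrete energies stands in for cells of phase space; the energy values, the value of kT, and the function name `canonical_weights` are hypothetical choices for illustration, and the interaction term H' is neglected as in the argument above.

```python
import math

def canonical_weights(energies, kT):
    """Normalized weights proportional to exp(-H/kT)."""
    raw = [math.exp(-e / kT) for e in energies]
    total = sum(raw)
    return [x / total for x in raw]

kT = 1.0
H1 = [0.0, 0.5, 1.3]   # hypothetical energies of cells of system 1
H2 = [0.2, 0.9]        # hypothetical energies of cells of system 2

f1 = canonical_weights(H1, kT)
f2 = canonical_weights(H2, kT)

# Canonical distribution of the combined system with energy H1 + H2.
combined = canonical_weights([a + b for a in H1 for b in H2], kT)
# Product of the separate distributions, in the same cell order.
product = [x * y for x in f1 for y in f2]

max_diff = max(abs(c - p) for c, p in zip(combined, product))
print(max_diff)  # zero up to rounding error
```

Replacing the exponential by any other function of the energy would in general break the agreement, which is the point of the remark about the exponential form.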

Suppose we choose the constant in the definition of the canonical 
ensemble so that $\int f\,dq_1 \ldots dp_n = 1$, and avoid having to 
bother with the denominator in taking averages; this corresponds 
to normalizing a wave function. Further, let us write the constant 
in the form $e^{F/kT}/h^n$, so that we have 

$$f(q_1 \ldots p_n) = \frac{1}{h^n}\,e^{(F - H)/kT},$$ 

$$\int f(q_1 \ldots p_n)\,dq_1 \ldots dp_n = 1 = \frac{1}{h^n}\int e^{(F - H)/kT}\,dq_1 \ldots dp_n. \qquad (1)$$ 

Here F is a quantity of the dimensions of energy, a function of T, 
chosen to make the constant have the correct value. Since 
$e^{F/kT}$ is dimensionless, and since the function $f$ must have the 
dimensions of $1/(dq_1 \ldots dp_n)$, in order to make its integral 
dimensionless, we must multiply by a constant of these dimensions. 
We have chosen $1/h^n$, which has the correct dimensions, since 
$h$ is of the dimensions of $pq$. It is a purely arbitrary matter 
that we have chosen this particular constant, since in all ordinary 
physical applications the constant drops out anyway, and it 
does not imply the introduction of quantum theory into classical 
questions. We shall later see, however, that it simplifies the 
comparison with quantum theory to have it there. 

264. The Free Energy. — Let us take the factor e F/kT out of the 
integral above (it does not depend on the q's and p's), and 
divide through by it. Then we have 

$$e^{-F/kT} = \frac{1}{h^n}\int e^{-H/kT}\,dq_1 \ldots dp_n. \qquad (2)$$ 

The integral on the right is often called the integral of state (we 
shall later see cases where it degenerates to a sum, called the sum 
of state). It is fundamental in thermodynamic applications. 
The quantity F is the free energy, and we proceed to investigate 
its properties. We have seen that it depends on the tempera- 
ture; but we must also observe that it depends on the volume. 
To see how this comes about, let us think about the Hamiltonian 
function H, in particular for a gas. We are considering only 
the nuclear motion, so that H includes the kinetic energy of the 
nuclei, and the potential energy of the interatomic forces, as 


discussed in the last chapter. But it also includes another term, 
if the gas is in an enclosure: the repulsion of the wall. The 
molecules of the gas, as they strike the wall, are repelled, so 
violently that they never penetrate the wall. We may say 
approximately that the potential energy becomes rapidly infinite 
as any molecule approaches the wall, and is infinite if any 
molecule is outside it, so that $e^{-H/kT}$ is zero in that case, and 
there is no probability of finding one of the molecules outside. 
Now this term in the potential depends on the volume of the 
vessel, the rapid rise of potential coming at the edge of the 
volume, which is adjustable. Thus we have $H(q_1 \ldots p_n, v)$, 
where v is the volume, so that the free energy, which depends 
on an integral of this quantity, also depends on v as well as T. 

Let us investigate the rates of change of the free energy with 
respect to volume and temperature. We have 


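$$\frac{\partial}{\partial v}\left(e^{-F/kT}\right) = -\frac{1}{kT}\frac{\partial F}{\partial v}\,e^{-F/kT} = \frac{1}{h^n}\int\left(-\frac{1}{kT}\frac{\partial H}{\partial v}\right)e^{-H/kT}\,dq_1 \ldots dp_n,$$ 

$$\frac{\partial F}{\partial v} = \frac{1}{h^n}\int\frac{\partial H}{\partial v}\,e^{(F - H)/kT}\,dq_1 \ldots dp_n = \overline{\frac{\partial H}{\partial v}},$$ 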

where we remember the formula for finding the average of any 
quantity. Now consider a cylinder filled with gas, closed with 
a piston of unit area. If we decrease the volume, the increment 
of volume being $-dv$, which therefore equals numerically the 
displacement of the piston, and if the pressure, and therefore 
the force on the piston is p, we shall do the work $-p\,dv$ on the 
system. This will represent the increase in energy of the system, 
or $dH$. Hence we have $\partial H/\partial v = -p$. We may consider this 
relation as stating that p is the generalized force connected with 
a generalized coordinate v, and therefore equal to the negative 
derivative of the energy with respect to this coordinate. Performing 
the average, we then have 

$$\left(\frac{\partial F}{\partial v}\right)_T = -p. \qquad (3)$$ 

Next we can differentiate the free energy with respect to 
temperature. We have 



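$$\frac{\partial}{\partial T}\left(e^{-F/kT}\right) = \left(\frac{F}{kT^2} - \frac{1}{kT}\frac{\partial F}{\partial T}\right)e^{-F/kT} = \frac{1}{h^n}\int\frac{H}{kT^2}\,e^{-H/kT}\,dq_1 \ldots dp_n,$$ 

$$F - T\frac{\partial F}{\partial T} = \bar H.$$ 

If we define $\bar H$, the mean energy of all systems of the ensemble, 
as the internal energy E, we have 

$$E = F - T\left(\frac{\partial F}{\partial T}\right)_v, \qquad (4)$$ 
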
the familiar Gibbs-Helmholtz equation. 
From Eqs. (3) and (4), we have 

$$dF = -p\,dv - \frac{E - F}{T}\,dT. \qquad (5)$$ 

Now let us define the entropy S by the equation 

$$F = E - TS, \qquad S = \frac{E - F}{T} = -\left(\frac{\partial F}{\partial T}\right)_v. \qquad (6)$$ 

Differentiating, this leads to $dE = dF + T\,dS + S\,dT$. Similarly, 
Eq. (5) becomes $dF = -p\,dv - S\,dT$. Combining, we are 
led to 

dE = TdS - pdv. (7) 

Equation (7) is the fundamental equation of thermodynamics, 
which we have derived from statistical methods. For the first 
law of thermodynamics is dE = dQ - pdv, where dQ is the heat 
absorbed in a process, pdv is the work done by the system. And 
the second law of thermodynamics is that for a reversible change 
(as our change is, since we assume that the distribution is always 
given by a canonical ensemble, which means that it is always in 
equilibrium), the quantity dQ/T is a perfect differential, dS. 
Combining these statements, we have Eq. (7). 

The specific heat can be found immediately by differentiating 
the energy with respect to temperature at constant volume. 
Using Eqs. (4) and (7), it is 

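$$C_v = \left(\frac{\partial E}{\partial T}\right)_v = T\left(\frac{\partial S}{\partial T}\right)_v = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_v. \qquad (8)$$ 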


Thus we can find the specific heat, as well as the equation of 
state, by differentiating the free energy. This makes it a very 
useful function, and its calculation, by means of the integral 
of state, is the usual method of deriving information about 
physical properties of substances. Of course, we could derive 
the same information from the energy itself as a function of 
volume and temperature, but it is not quite so convenient to use. 

265. Properties of Perfect Gases on Classical Theory.— Let us 
apply the method of the free energy to the calculation of the 
equation of state and specific heat of a perfect gas, on classical 
mechanics. Let there be N molecules, each of mass m, so that 


$$H = \sum_{i=1}^{N}\frac{p_{xi}^2 + p_{yi}^2 + p_{zi}^2}{2m} + V,$$ 

where V, the potential energy, is zero so long as all molecules 
are within the volume v, but becomes infinite if even one molecule 
strays outside. Then we have 

$$\int e^{-H/kT}\,dq_1 \ldots dp_N = \int_{-\infty}^{\infty} e^{-p_{x1}^2/2mkT}\,dp_{x1} \cdots \int_{-\infty}^{\infty} e^{-p_{zN}^2/2mkT}\,dp_{zN} \int \cdots \int e^{-V/kT}\,dx_1 \ldots dz_N. \qquad (9)$$ 

Now by direct integration each of the integrals over the p's 
is simply $\sqrt{2\pi mkT}$. The integral over coordinates is the integral 
of unity over all regions where the coordinates are inside v, and of 
zero over the outside, so that it is 

$$\iiint_v dx_1\,dy_1\,dz_1 \cdots \iiint_v dx_N\,dy_N\,dz_N = v^N.$$ 

Thus we have finally 

$$e^{-F/kT} = \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N} v^N, \qquad (10)$$ 

a function of temperature and volume as it should be, and the free 
energy itself is 

$$F = -3NkT\ln\frac{\sqrt{2\pi mkT}}{h} - NkT\ln v.$$ 

From this we have at once $p = NkT/v$, giving the ordinary law 
of perfect gases, and $C_v = \tfrac{3}{2}Nk$, likewise a well-known result. 
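
These two results can be recovered numerically from the free energy alone, using Eqs. (3) and (8) with finite differences. The sketch below is illustrative only: the function names, the step sizes, and the sample inputs (roughly one mole of helium-like atoms in cgs units) are assumptions, not values taken from the text.

```python
import math

K_BOLTZMANN = 1.380649e-16   # erg/K
H_PLANCK = 6.62607015e-27    # erg s

def free_energy(N, m, T, v):
    """Perfect-gas free energy: F = -3NkT ln(sqrt(2 pi m k T)/h) - NkT ln v."""
    kT = K_BOLTZMANN * T
    return (-3.0 * N * kT * math.log(math.sqrt(2.0 * math.pi * m * kT) / H_PLANCK)
            - N * kT * math.log(v))

def pressure(N, m, T, v, dv=1.0):
    """p = -(dF/dv) at constant T, by a central difference."""
    return -(free_energy(N, m, T, v + dv) - free_energy(N, m, T, v - dv)) / (2.0 * dv)

def heat_capacity(N, m, T, v, dT=1.0):
    """C_v = -T (d^2 F / dT^2) at constant v, by a second central difference."""
    d2F = (free_energy(N, m, T + dT, v) - 2.0 * free_energy(N, m, T, v)
           + free_energy(N, m, T - dT, v)) / dT**2
    return -T * d2F

# Assumed illustrative inputs: N atoms of mass m (g) at T (K) in volume v (cm^3).
N, m, T, v = 6.022e23, 6.65e-24, 300.0, 22400.0
p = pressure(N, m, T, v)
cv = heat_capacity(N, m, T, v)
print(p * v / (N * K_BOLTZMANN * T))   # ratio to NkT/v, close to 1
print(cv / (1.5 * N * K_BOLTZMANN))    # ratio to (3/2)Nk, close to 1
```

Note that the constant h drops out of both derivatives, in line with the earlier remark that its choice is arbitrary in classical applications.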


266. Properties of Imperfect Gases on Classical Theory. — 

Next let us consider an imperfect monatomic gas, such as an 
inert gas. This differs only in that there is an additional term 
in the Hamiltonian, a sum of interaction energies of each pair of atoms: 

$$H = \text{kinetic energy} + \sum_{\text{pairs } i,j} V_{ij} + \text{repulsion of walls},$$ 

where $V_{ij}$ may be the sum of a Van der Waals attraction between 
the ith and jth atoms at large relative distances, and an exponential 
repulsion at small distance. We then have 

$$e^{-F/kT} = \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N}\iiint dx_1\,dy_1\,dz_1 \cdots \iiint dx_N\,dy_N\,dz_N\; e^{-\sum V_{ij}/kT}. \qquad (11)$$ 

The integration over the coordinates can be carried out in steps. 
First we integrate over the coordinates of the Nth molecule. The 
quantity $e^{-\sum V_{ij}/kT}$ can be factored: it is equal to 

$$e^{-\sum' V_{ij}/kT}\; e^{-\sum_{i \neq N} V_{iN}/kT},$$ 

where $\sum'$ represents all those pairs which do not include the Nth 
molecule. The first factor then does not depend on the coordinates 
of the Nth molecule, and may be taken outside the integration 
over its coordinates, leaving 

$$\iiint_v e^{-\sum_{i \neq N} V_{iN}/kT}\,dx_N\,dy_N\,dz_N.$$ 

We rewrite this as 

$$\iiint_v dx_N\,dy_N\,dz_N - \iiint_v\left(1 - e^{-\sum_{i \neq N} V_{iN}/kT}\right)dx_N\,dy_N\,dz_N = v - W, \qquad (12)$$ 

the first term being simply the volume, the second term being an 
integral to be evaluated. To investigate W, let us imagine all 
the molecules except the Nth being in definite positions of space. 
If the gas is rare, the chances are that they will be well separated 
from each other. Now if the point $x_N, y_N, z_N$ is far from any of these 
molecules, the interatomic potentials $V_{iN}$ will all be small, and 
the integrand will be practically $1 - e^0 = 0$. Thus we have 
contributions to this integral only from the immediate neighborhood 
of each molecule. If all are alike, each of these contributions 
will be equal to 

$$w = \iiint\left(1 - e^{-V_{iN}/kT}\right)dx_N\,dy_N\,dz_N,$$ 

a quantity which, though it formally involves the index i, actually 
is independent of i. In fact, if we imagine the ith molecule to be 
located at the origin, and remember that $V_{iN}$ is a function of r, 
the distance from the origin, we see at once that 

$$w = \int_0^{\infty} 4\pi r^2\left(1 - e^{-V/kT}\right)dr, \qquad (13)$$ 

where we integrate to infinity instead of to the boundary of the 
vessel because the integrand is so small for r's larger than mole- 
cular dimensions that it makes no difference. In terms of this, 
we then have 

$$W = (N - 1)w. \qquad (14)$$ 

Now when we integrate over the coordinates of the $(N - 1)$st 
particle, we have just the same situation over again, except that 
there are only $(N - 2)$ remaining molecules, and so on. Thus 
finally we have for the integral over coordinates 

$$[v - (N - 1)w][v - (N - 2)w] \cdots v.$$ 

We can easily evaluate this product, by taking its logarithm, 
which is what we want anyway. This is 

$$\sum_{s=0}^{N-1}\ln(v - sw) = \sum_{s=0}^{N-1}\ln v + \sum_{s=0}^{N-1}\ln(1 - sw/v).$$ 

The first term is $N \ln v$, which we should have for the perfect 
gas. For the second, we note that on account of the rarity of the 
gas, $sw/v$ is always small compared with 1. Hence we have 
$\ln(1 - sw/v) = -sw/v$ approximately, and the sum is approximately 
equal to the integral with respect to s, or 

$$-\int_0^{N-1}\frac{sw}{v}\,ds = -\frac{(N - 1)^2 w}{2v}.$$ 

To this order, then, neglecting unity in comparison with N, we have 

$$F = -3NkT\ln\frac{\sqrt{2\pi mkT}}{h} - NkT\ln v + \frac{N^2 wkT}{2v}. \qquad (15)$$ 
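
The approximation made in passing from the product to Eq. (15) can be tested directly by summing the logarithms term by term. The numbers below (N, v, w) are hypothetical and chosen only so that Nw/v is small, as the argument requires; the function names are likewise illustrative.

```python
import math

def log_config_integral_exact(N, v, w):
    """ln of [v-(N-1)w][v-(N-2)w]...v, summed term by term."""
    return sum(math.log(v - s * w) for s in range(N))

def log_config_integral_approx(N, v, w):
    """Rare-gas approximation: N ln v - N^2 w / (2v)."""
    return N * math.log(v) - N * N * w / (2.0 * v)

# Hypothetical values with N*w/v = 1e-3.
N, v, w = 10_000, 1.0, 1.0e-7
exact = log_config_integral_exact(N, v, w)
approx = log_config_integral_approx(N, v, w)
print(exact, approx)
```

The small residual difference comes mainly from the neglected second-order term of the logarithm, which is of relative order Nw/v compared with the correction that is kept.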


We then have 

$$p = \frac{NkT}{v} + \frac{N^2 wkT}{2v^2} + \cdots.$$ 

This is often written in the form 

$$\frac{pv}{RT} = 1 + \frac{Nw}{2v} + \cdots, \qquad (16)$$ 

where R = Nk. This expression pv/RT is called the virial, and 
the coefficients of its expansion in inverse powers of the volume 
are called the virial coefficients, so that Nw/2 is the second virial 
coefficient. The results of experiments on imperfect gases are 
ordinarily given as tables of the virial coefficients as functions of 
temperature, and by the equation above we can compute the 
second coefficient, finding w if necessary by numerical integration 
from $V(r)$. In addition to the pressure, we can, of course, find 
the specific heat, and it immediately comes out the same as for 
the perfect gas. We must remember, however, the rotational 
and vibrational specific heats of the polyatomic gases, which 
must be added to the translational terms to get the total specific heat. 
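
The numerical integration of Eq. (13) mentioned above can be sketched as follows. The Lennard-Jones form used for V(r) is an assumed stand-in (it is not the potential of the preceding chapter), reduced units with sigma = eps = 1 are assumed, and the cutoff radius and step count are arbitrary illustrative choices.

```python
import math

def lj_potential(r, eps=1.0, sigma=1.0):
    """Assumed Lennard-Jones stand-in for V(r), in reduced units."""
    x6 = (sigma / r) ** 6
    return 4.0 * eps * (x6 * x6 - x6)

def cluster_integral(V, kT, r_max=50.0, steps=20000):
    """w = integral from 0 to infinity of 4 pi r^2 (1 - exp(-V/kT)) dr,
    Eq. (13), by the midpoint rule, truncated at r_max."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * (1.0 - math.exp(-V(r) / kT)) * dr
    return total

def second_virial(V, kT, N=1.0):
    """Second virial coefficient N w / 2."""
    return 0.5 * N * cluster_integral(V, kT)

b_cold = second_virial(lj_potential, 1.0)    # kT equal to the well depth
b_hot = second_virial(lj_potential, 10.0)    # kT ten times the well depth
print(b_cold, b_hot)
```

With this assumed potential the coefficient is negative at the lower temperature, where attraction dominates, and positive at the higher one, where the repulsive core dominates.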

267. Van der Waals' Equation. — There is a limiting case in 
which we can compute w approximately. This is the case where 
the attractive part of $V(r)$ varies slowly with r, while the repulsive 
part varies so rapidly that it can be considered zero if r is 
greater than $r_0$, infinite if r is less. This is what we should have 
if the molecules were rigid spheres of diameter $r_0$, attracted by the 
Van der Waals' attraction. If we let $V_0(r)$ represent the attraction, 
we have 

$$e^{-V/kT} = \begin{cases} 0 & \text{if } r < r_0, \\ e^{-V_0/kT} & \text{if } r > r_0. \end{cases}$$ 

The integral then is 

$$w = \int_0^{r_0} 4\pi r^2(1 - 0)\,dr + \int_{r_0}^{\infty} 4\pi r^2\left[1 - e^{-V_0/kT}\right]dr.$$ 

The first term is simply $\tfrac{4}{3}\pi r_0^3$, the volume of a sphere of radius 
$r_0$, or eight times the volume of the sphere of diameter $r_0$ which 
represents a molecule. In the second integral, we may expand the 
exponential as a power series, since $V_0$ is relatively small: it is 
$1 - [1 - (V_0/kT) \cdots] = V_0/kT$. Thus this term is 
$\frac{1}{kT}\int_{r_0}^{\infty} 4\pi r^2 V_0\,dr + \cdots$. If, for instance, we have the type of Van der Waals' 
force considered in the last chapter, we have $V_0 = -\beta/r^6$, where 
$\beta$ = an 2 . Then the term is $-(4\pi\beta/3r_0^3 kT)$. In this case, the 
second virial coefficient becomes 

$$\frac{Nw}{2} = \frac{N}{2}\left(\frac{4}{3}\pi r_0^3\right) - \frac{2N\pi\beta}{3r_0^3 kT},$$ 

the further terms being in higher inverse powers of T. We may 
write this 

$$\frac{Nw}{2} = b - \frac{A}{RT},$$ 

where b is four times the volume of all molecules, $A = 2N^2\pi\beta/3r_0^3$. 
Actual gases have second virial coefficients which agree well with 
this formula. The pressure, in other words, is given by the equation 

$$\frac{pv}{RT} = 1 + \frac{b}{v} - \frac{A}{RTv} + \cdots, \qquad (17)$$ 

being greater than for a perfect gas for large T (the b/v term 
preponderating), and less for small T. Physically, at high temperature, 
the finite size of the molecules, given by b, decreases 
the apparent volume, which produces an increase of pressure; 
while at lower temperatures the attractions between molecules, 
given by A, pull the gas together. 
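
The quality of the first-order expansion for rigid spheres with an inverse-sixth-power attraction can be checked numerically. In the sketch below the values of r0, beta, and kT are hypothetical (chosen so that the attraction at contact is small compared with kT), and the function names, cutoff, and step count are illustrative assumptions.

```python
import math

def w_numerical(r0, beta, kT, r_max_factor=100.0, steps=100000):
    """w for rigid spheres of diameter r0 with V0 = -beta/r^6 outside r0."""
    core = 4.0 * math.pi * r0**3 / 3.0          # integrand equals 1 for r < r0
    r_max = r_max_factor * r0
    dr = (r_max - r0) / steps
    tail = 0.0
    for i in range(steps):
        r = r0 + (i + 0.5) * dr
        # -V0/kT = +beta/(r^6 kT), so the integrand is negative here
        tail += 4.0 * math.pi * r * r * (1.0 - math.exp(beta / (r**6 * kT))) * dr
    return core + tail

def w_first_order(r0, beta, kT):
    """Leading terms: (4/3) pi r0^3 - 4 pi beta / (3 r0^3 kT)."""
    return 4.0 * math.pi * r0**3 / 3.0 - 4.0 * math.pi * beta / (3.0 * r0**3 * kT)

r0, beta, kT = 1.0, 0.05, 1.0   # hypothetical reduced units
w_num = w_numerical(r0, beta, kT)
w_lin = w_first_order(r0, beta, kT)
print(w_num, w_lin)
```

The difference between the two values is the contribution of the terms in higher inverse powers of T, which grows as kT is lowered toward the depth of the attraction.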

There is a very well-known equation, Van der Waals' equation, 
for the pressure of an imperfect gas. This is 

$$\left(p + \frac{A}{v^2}\right)(v - b) = RT. \qquad (18)$$ 

This differs from the equation of state of a perfect gas in two 
respects : in having the volume (v — b) in place of v, as if the mole- 
cules took up space, and in having the pressure increased by the 
amount $A/v^2$. The arguments used to deduce the equation are 
not reliable, and it cannot be regarded as more than a very useful 
empirical formula. But as far as the second virial coefficient is 
concerned, it is correct. If we compute pv/RT from it, and 
expand in inverse powers of volume and temperature, we can at 
once show that the expansion is what we have already found, as 
far as the term in 1/v, the values of b and A agreeing with those 
we have already given. The higher terms in the expansion, 


however, do not agree with what we should get by correct 
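
The agreement as far as the term in 1/v can be seen numerically by comparing pv/RT from Van der Waals' equation with the two-term virial form. The constants below are illustrative cgs values of plausible magnitude for a simple gas; they are assumptions, not data from the text.

```python
def vdw_compressibility(v, T, A, b, R):
    """pv/RT computed from (p + A/v^2)(v - b) = RT."""
    p = R * T / (v - b) - A / v**2
    return p * v / (R * T)

def virial_two_terms(v, T, A, b, R):
    """1 + (b - A/RT)/v, the expansion through the second virial coefficient."""
    return 1.0 + (b - A / (R * T)) / v

# Assumed illustrative values: R in erg/(mol K), b in cm^3/mol,
# A in erg cm^3/mol^2, v in cm^3/mol, T in K.
R, b, A = 8.314e7, 30.0, 1.4e12
v, T = 22400.0, 300.0
z_exact = vdw_compressibility(v, T, A, b, R)
z_virial = virial_two_terms(v, T, A, b, R)
print(z_exact, z_virial)
```

The remaining difference is of order (b/v)^2, that is, it belongs to the third virial coefficient.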

268. Quantum Statistics. — Distribution functions, and hence 
canonical ensembles, have a rather different meaning in quantum 
theory from what they have in classical mechanics. For on 
account of the uncertainty principle we can no longer specify 
both coordinates and momenta, and hence cannot give functions 
of the q's and p's. Instead, as we have seen, we deal with a wave 
function $\psi$, such that $\bar\psi\psi$ gives the probability of finding the 
system at a given point of space. We could set up the corresponding 
quantity in classical statistics: if $f(q_1 \ldots p_n)$ is the 
ordinary distribution function, normalized so that its integral is 
unity, then $\int \cdots \int f(q_1 \ldots p_n)\,dp_1 \ldots dp_n$ would give a function 
of the q's, giving the probability of finding the system with 
given q's. Thus we should have the correspondence 

$$\int \cdots \int f(q_1 \ldots p_n)\,dp_1 \ldots dp_n \sim \bar\psi\psi,$$ 

the two quantities agreeing at any rate in the limit of large quantum 
numbers, where classical and quantum theory approach each other. 

It is not difficult to show that this correspondence holds, at 
least with one degree of freedom. First, we consider micro- 
canonical ensembles, ensembles in which all systems have the 
same energy, but are distributed in phase as if they had started 
off at all arbitrary instants of time. In such a case, with one 
degree of freedom, the probability of finding a system in a given 
range of coordinates is proportional to the length of time a system 
would stay in that range, or is inversely as its velocity. But 
now the corresponding quantum ensemble is one in which all 
systems are in the same stationary state. And using the Wentzel- 
Kramers-Brillouin method, we have already seen, in Chap. 
XXIX, that $\bar\psi\psi$ is approximately proportional to $1/\sqrt{E - V}$, or 
inversely as the velocity, so that in this case we actually have the 
correspondence we desire between classical and quantum theories. 
The same thing can be shown with more than one degree of freedom. 

Now any kind of classical ensemble which is independent of 
time can be made up of microcanonical ensembles; we may regard 
it as consisting of a certain distribution on each energy surface. 
The corresponding situation is a quantum state in which all 
stationary states are excited at once, represented by a wave 
function $\sum_k c_k u_k e^{-2\pi i E_k t/h}$. The corresponding density, averaged 
over the rapid time fluctuations, is $\sum_k \bar c_k c_k \bar u_k u_k$, corresponding to 
a fraction $\bar c_k c_k$ of all the systems being in the kth stationary state, 
or belonging to the particular microcanonical ensemble having 
energy E k . Let us see what is the classical ensemble correspond- 
ing to this combination. We may approximate it in the following 
way. Let us imagine the energy surfaces corresponding to the 
stationary states drawn in the classical phase space. Then let 
a fraction $\bar c_k c_k$ of the systems of the classical ensemble be uniformly 
distributed through the region between the kth and 
$(k + 1)$st energy surfaces, rather than just on the energy surfaces. 
We do this to get a continuous function. Then evidently the 
density of points between the kth and $(k + 1)$st surfaces will be 
$\bar c_k c_k$ divided by the volume of phase space between these surfaces. 
This volume, as we have seen, is $h^n$. Then we have the correspondence 

$$f(q_1 \ldots p_n) \sim \frac{\bar c_k c_k}{h^n} \quad \text{between } E_k \text{ and } E_{k+1}.$$ 

This gives a step-like function for /, which would approach con- 
tinuity as the stationary states got closer together. Now it is 
plain how we are to set up a canonical ensemble: we are to set 
$\bar c_k c_k$ proportional to $e^{-E_k/kT}$, and this will then give the right 
variation for /. Of course, our correspondence is not exact, but 
we assume that the quantum canonical ensemble is the exactly 
correct thing, the classical one the approximation to it. This is 
justified by the fact that we can give just the same argument for 
the canonical ensemble's representing thermal equilibrium in 
the quantum theory that we could in classical theory, and we 
know quantum theory to be the correct form in cases where it 
differs from classical theory. 

Having the canonical ensemble in quantum theory, we can now 
proceed to the calculation of the free energy and equation of 
state as we did in classical theory. To get exact correspondence, 
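we should set 

$$f(q_1 \ldots p_n) = \frac{e^{(F - H)/kT}}{h^n} \sim \frac{\bar c_k c_k}{h^n} = \frac{e^{(F - E_k)/kT}}{h^n}.$$ 

Now the integral $\int \cdots \int f\,dq_1 \ldots dp_n$ goes over into a sum over 
all stationary states, multiplied by the volume of phase space 
associated with each stationary state, or $h^n$. Thus we have 

$$1 = \sum_k \bar c_k c_k = \sum_k e^{(F - E_k)/kT},$$ 

and finally 

$$e^{-F/kT} = \sum_k e^{-E_k/kT}. \qquad (19)$$ 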

In the case of degeneracy, where there are several stationary 
states of the same energy, the sum in Eq. (19) includes a term for 
each state, so that for an energy level with g states, we have g 
times the contribution from a single level of the same energy. 
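
As a numerical illustration of Eq. (19), the following Python sketch evaluates the sum over states for a short list of levels and returns the free energy. The function names, the (energy, degeneracy) input format, and the sample two-level system are assumptions made for illustration; they do not come from the text. A level with g states enters as g equal terms, as described above.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def sum_over_states(levels, temperature):
    """Return sum_k g_k * exp(-E_k / kT) for levels given as (E_k, g_k) pairs."""
    kT = K_BOLTZMANN * temperature
    return sum(g * math.exp(-energy / kT) for energy, g in levels)


def free_energy(levels, temperature):
    """Free energy F from exp(-F/kT) = sum over states, Eq. (19)."""
    kT = K_BOLTZMANN * temperature
    # Factor out the lowest level so the exponentials cannot all underflow to zero.
    e_min = min(energy for energy, _ in levels)
    shifted = [(energy - e_min, g) for energy, g in levels]
    return e_min - kT * math.log(sum_over_states(shifted, temperature))


def occupation_fractions(levels, temperature):
    """Fraction of systems in each level: g_k * exp((F - E_k) / kT)."""
    kT = K_BOLTZMANN * temperature
    f = free_energy(levels, temperature)
    return [g * math.exp((f - energy) / kT) for energy, g in levels]


# Hypothetical example: a ground level and a doubly degenerate level kT above it.
T_EXAMPLE = 300.0
KT_EXAMPLE = K_BOLTZMANN * T_EXAMPLE
EXAMPLE_LEVELS = [(0.0, 1), (KT_EXAMPLE, 2)]

if __name__ == "__main__":
    print("F / kT =", free_energy(EXAMPLE_LEVELS, T_EXAMPLE) / KT_EXAMPLE)
    print("fractions =", occupation_fractions(EXAMPLE_LEVELS, T_EXAMPLE))
```

The occupation fractions returned here sum to one, which is the normalization condition that fixes F in the first place.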

269. Quantum Theory of the Perfect Gas. — We have already 
shown the correspondence between the classical and quantum 
expressions for free energy, to the approximation to which the 
Wentzel-Kramers-Brillouin method is accurate. This shows us 
that, for both the perfect and imperfect gases, we may expect to 
find about the same equation of state and specific heat on both 
theories. The errors in the method are large only when the wave 
length is changing very rapidly, and this actually comes, in this 
problem, only when two molecules are in collision with each other, 
or are colliding with the walls. Accurate discussion shows that 
there are appreciable corrections to the classical equation of 
state introduced in this way for the lightest gases (which there- 
fore have longest wave length for a given velocity), but even 
these are small, and difficult to discuss. It is easy, however, to 
carry through the exact solution of the quantum theory of the 
perfect gas, and this will suffice to show the general situation. 

Let the gas be confined in a rectangular volume of sides A, B, 
C. Then the wave functions for single molecules satisfying the 

boundary conditions of being zero on the boundaries are
$\sin\frac{p\pi x}{A}\,\sin\frac{q\pi y}{B}\,\sin\frac{r\pi z}{C}$, where p, q, r are integers. A wave function
for the whole gas can be built up from this by multiplying
together functions for all the molecules, obtaining

$$u = \sin\frac{p_1\pi x_1}{A} \cdots \sin\frac{r_N\pi z_N}{C}.$$

Substituting in Schrödinger's equation,

$$-\frac{h^2}{8\pi^2 m}\left(\nabla_1^2 + \cdots + \nabla_N^2\right)u + Vu = Eu,$$

where V is the same potential of repulsion of the walls which 
we have considered before, we at once have 

$$E = \frac{h^2}{8m}\left(\frac{p_1^2}{A^2} + \cdots + \frac{r_N^2}{C^2}\right). \qquad (20)$$

To get all states, we must take all combinations of the integers 
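$p_1 \ldots r_N$, each going from one to infinity. Thus we have

$$e^{-F/kT} = \sum_{p_1=1}^{\infty} \cdots \sum_{r_N=1}^{\infty} e^{-E/kT} = \sum_{p_1=1}^{\infty} e^{-h^2p_1^2/8A^2mkT} \cdots \sum_{r_N=1}^{\infty} e^{-h^2r_N^2/8C^2mkT}. \qquad (21)$$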

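Now at reasonably high temperatures, T is so large that we have
to go to large values of the integers p, etc., before the exponential
begins to fall off appreciably. Thus the terms of our summation
differ only slightly from each other, and we can replace the sum by
an integral, one factor being

$$\int_0^\infty e^{-h^2p^2/8A^2mkT}\,dp = \frac{A\sqrt{2\pi mkT}}{h}.$$

Thus we have

$$e^{-F/kT} = (ABC)^N\left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N} = v^N\left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N},$$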

where v = ABC is the volume of the gas, agreeing exactly with 
the classical value, so that equation of state and specific heat 
are not altered by using the quantum theory. At lower tempera- 
tures, where we cannot replace the summation by an integration, 
there will be discrepancies; the gas here is said to be "degenerate." 
At the same time other features enter the situation, different 


sorts of statistics known as the Bose and Fermi statistics, which 
we shall discuss later in other connections. We shall not work 
out the case of degeneracy here, since practically one cannot 
reach such low temperatures without liquefying the gas, and 
since we shall meet in the next chapter some corresponding 
situations in solids, which are actually attained, and are of much 
more physical interest. 
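
The replacement of the sum by an integral can be checked numerically. The sketch below, with function names and sample values chosen here for illustration rather than taken from the text, compares one factor of Eq. (21), the sum of exp(-a p^2) over p = 1, 2, ..., with the integral value (1/2) sqrt(pi/a), where a = h^2/8A^2mkT. The hydrogen molecule mass in the example is an assumed round value.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def exponent_coefficient(side, mass, temperature):
    """Dimensionless a = h^2 / (8 A^2 m k T) for a box of side A (SI units)."""
    return H_PLANCK**2 / (8.0 * side**2 * mass * K_BOLTZMANN * temperature)


def state_sum(a, tolerance=1e-16):
    """Sum of exp(-a p^2) over p = 1, 2, ..., stopped once terms are negligible."""
    total = 0.0
    p = 1
    while True:
        term = math.exp(-a * p * p)
        total += term
        if term <= tolerance * total:
            return total
        p += 1


def state_integral(a):
    """Integral of exp(-a p^2) from 0 to infinity, equal to A sqrt(2 pi m k T) / h."""
    return 0.5 * math.sqrt(math.pi / a)


if __name__ == "__main__":
    # Assumed example: H2 (mass about 3.35e-27 kg) in a 10 cm box at 300 K.
    print("a for H2, 10 cm, 300 K:", exponent_coefficient(0.1, 3.35e-27, 300.0))
    for a in (1e-4, 1e-2, 1.0, 10.0):
        print(a, state_sum(a) / state_integral(a))
```

For small a (high temperature or a large container) the ratio of sum to integral is close to one; for a of order unity or larger only the lowest states contribute and the integral is a poor approximation. The direct sum is only practical for moderate a, since a laboratory-sized container at room temperature gives an a so small that far too many terms would be needed.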


1. For neon, experimentally, b = 20.6 c.c. for a mole. Find the equivalent
diameter r of the atoms, regarded as rigid spheres. Compare this with the
sum of the quantities $n^2/(Z - S)$ for the two atoms.

2. Using our approximate methods of dealing with Van der Waals' 
attraction, and using the value of r from Prob. 1, compute the constant 
A for neon. Compare with the experimental value of $0.21 \times 10^{12}$ absolute
units (you cannot expect very good agreement).

3. Using the experimental values of b and A for neon, from Probs. 1 and
2, draw a graph for the second virial coefficient as function of temperature. 
At what temperature does the graph cross the T axis, and what does this 
mean physically? 

4. Carry out the expansion of Van der Waals' equation in virial form, 
showing that the second virial coefficient is as we have found, and computing
the third virial coefficient as well. 

5. Using Van der Waals' equation, plot a number of isothermals (lines of 
constant T, p being plotted against v). Choose both low and high tem- 
peratures. Use the constants given in Probs. 1 and 2 for neon. Note 
that at low temperatures the isothermals have a maximum and minimum, 
while at high temperatures they do not. As is well known, this maximum 
and minimum are not really present, but the region in which they occur 
is that in which gas and liquid are in equilibrium and exist as a mixture. 

6. The critical point is that point where the maximum and minimum of 
the isothermals of Van der Waals' equation coincide, or where the first 
derivative of pressure with respect to volume at constant T has a double 
root. Compute the critical pressure, temperature, and volume, for neon, 
using the constants given in Probs. 1 and 2. 

7. Hydrogen gas is confined in a container 10 cm. on a side. Find the 
order of magnitude of the temperature at which it would become degenerate; 
that is, the temperature at which most of the molecules would be in the 
lowest quantum state. 

8. Compute the internal energy and entropy of a perfect gas by the 
classical theory. 


In the last chapter, we have seen that in addition to the 
equation of state of gases, there was another range of phenomena 
which we could treat satisfactorily: the phenomena resulting 
from nuclear vibrations in molecules, leading to the vibrational 
specific heat, and in solids, leading to the equation of state and 
specific heat. The mathematical methods used in dealing 
with them are similar, so that they can be profitably treated 
together. At the same time, the question of the stationary 
states of vibrating molecules is of interest in itself, and can be 
easily taken up. 

We shall begin with the problem of a crystalline solid; the 
extension to a molecule, which after all is not very different from 
a fragment of such a solid, is not hard to make. Our problem 
is to find pressure as function of temperature and volume, and 
specific heat. Ordinarily the measurements of the equation 
of state of a solid take the form of measuring the compressibility 
and thermal expansion: we express volume as a function of 
pressure and temperature, and have 

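$$\text{compressibility} = \kappa = -\frac{1}{v}\left(\frac{\partial v}{\partial p}\right)_T,$$

$$\text{thermal expansion} = \frac{1}{v}\left(\frac{\partial v}{\partial T}\right)_p.$$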

We shall thus compute these quantities. Now a solid, unlike a 
gas, behaves in a perfectly normal way at the absolute zero of 
temperature. Its volume is finite and definitely determined, 
being given from the equilibrium positions of its own atoms and 
molecules, which all pack closely enough to be in their equilibrium 
positions, since they have no kinetic energy. If external pres- 
sure is applied, the volume will decrease, and we can compute 
the compressibility. Temperature will not greatly change these 
quantities: temperature agitation slightly increases the volume, 
and makes the crystal more compressible, but these effects are 



small enough to be treated as perturbations of the state at 
absolute zero. Hence we begin by considering the crystal 
without temperature agitation. 

270. The Crystal at Absolute Zero. — The energy of a crystal 
at the absolute zero, when its atoms are in perfectly definite 
positions, is simply the sum of the interaction energies for all 
pairs of atoms. In the position of equilibrium, this energy must 
be a minimum with respect to any possible small deformation 
of the crystal. Thus each separate atom is in equilibrium with 
respect to a slight displacement, keeping all other atoms fixed, 
so that it is at a minimum of potential, and could execute vibra- 
tions about this position of equilibrium, which to a first approxi- 
mation would be simple harmonic. But there are other sorts of 
distortion to consider. For instance, we may decrease the whole 
volume slightly, moving the atoms closer together but preserving 
their relative arrangement, and the energy must be a minimum 
with respect to such a distortion. It is this which particularly 
interests us in computing compressibilities. Now in a very 
simple crystal lattice, if the volume is decreased, the atoms will 
still have just the same arrangement. Thus NaCl has a cubic 
lattice, Na and Cl ions being found alternately at the corners
of cubes, and squeezing the whole lattice would merely decrease 
the size of the cubes. The same thing is true of the simpler 
metals. It is easy to see that it is not always the case; a crystal 
composed of molecules rather loosely tied together would, under 
compression, have the molecules forced closer together without 
much change in the dimensions of each molecule. We do not 
consider such complicated cases, however, but rather assume that 
all interatomic distances r are proportional to the dimension 
$\delta$ of the crystal as a whole.

Let us assume, then, a cube of crystal, of side $\delta$, a quantity
which depends on the pressure. Let this cube contain N atoms. 
Now let us assume that the potential energy of the force between 
two atoms at distance r is the sum of two terms: an attractive 
term, negative in sign, proportional to 1/r for ionic crystals, 
or exponential in r for valence crystals; and a repulsive term, 
positive, and varying exponentially with r. The total energy 
of the crystal is the sum of all interatomic potential terms. 
This sum, for the exponential terms, is easy to compute. For 
these terms fall off so rapidly that practically only the nearest 
neighbors contribute appreciable terms to the energy. Thus 


we simply take each of the N atoms, and sum up the exponential
terms to its s nearest neighbors. This, as we readily see, counts
each pair twice over, so that the sum is $\tfrac{1}{2}NsAe^{-\alpha r}$,
where r is the distance to the nearest neighbor, A and $\alpha$ are
constants, or $\tfrac{1}{2}NsAe^{-a\delta} = Ce^{-a\delta}$, where $a = \alpha r/\delta$, since r is
proportional to $\delta$. For the inverse power attraction between
ions, we cannot confine ourselves to nearest neighbors, since the
forces fall off too slowly. Since each term in the energy is
proportional to an inverse interatomic distance, and therefore
to $1/\delta$, however, the energy will likewise be proportional to $1/\delta$,
and the coefficient can be calculated by a proper method of
summation over all ions.
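
A minimal numerical sketch of where these two contributions lead, assuming the total energy is written as -K/δ + C e^{-aδ}, with K standing for the coefficient of the attractive term, and using dE = -p dv with v = δ^3. The parameter values are arbitrary dimensionless numbers chosen only so that an energy minimum exists; they describe no real crystal, and the function names are likewise illustrative.

```python
import math


def energy(delta, K, C, a):
    """Total energy: ionic attraction -K/delta plus repulsion C exp(-a delta)."""
    return -K / delta + C * math.exp(-a * delta)


def pressure(delta, K, C, a):
    """p = -dE/dv = -(dE/d delta) / (3 delta^2), since v = delta^3."""
    dE_ddelta = K / delta**2 - a * C * math.exp(-a * delta)
    return -dE_ddelta / (3.0 * delta**2)


def dp_dv(delta, K, C, a):
    """Derivative of pressure with respect to volume, via the chain rule."""
    repulsive = C * (a * a + 2.0 * a / delta) * math.exp(-a * delta)
    return 4.0 * K / (9.0 * delta**7) - repulsive / (9.0 * delta**4)


def equilibrium_side(K, C, a, lo, hi, iterations=200):
    """Bisection for the zero-pressure side; [lo, hi] must bracket a sign change."""
    p_lo = pressure(lo, K, C, a)
    if p_lo * pressure(hi, K, C, a) > 0.0:
        raise ValueError("pressure does not change sign on the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        p_mid = pressure(mid, K, C, a)
        if p_lo * p_mid <= 0.0:
            hi = mid
        else:
            lo, p_lo = mid, p_mid
    return 0.5 * (lo + hi)


def compressibility(delta, K, C, a):
    """kappa = -(1/v) dv/dp, evaluated from dp/dv at side delta."""
    return -1.0 / (delta**3 * dp_dv(delta, K, C, a))


# Arbitrary illustrative parameters (dimensionless, not fitted to any crystal).
K_EX, C_EX, A_EX = 1.0, 100.0, 10.0
DELTA_0 = equilibrium_side(K_EX, C_EX, A_EX, lo=0.5, hi=1.0)

if __name__ == "__main__":
    print("equilibrium side:", DELTA_0)
    print("compressibility at zero pressure:", compressibility(DELTA_0, K_EX, C_EX, A_EX))
```

The bracket passed to the bisection matters: with these forms the energy also has a maximum at smaller δ, inside which the attraction wins and the model no longer describes a stable crystal, so the bracket is chosen around the minimum.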

Having the total energy, it is easy to compute the compressibility.
We consider the ionic crystal, where the energy has the form

$$E = -\frac{K}{\delta} + Ce^{-a\delta}. \qquad (1)$$

We note that $dE = -p\,dv$, where E is the energy, p the pressure,
$v = \delta^3$ is the volume, so that

$$p = -\frac{dE}{dv} = -\frac{dE}{d\delta}\frac{d\delta}{dv} = -\left(\frac{K}{\delta^2} - aCe^{-a\delta}\right)\frac{1}{3\delta^2}. \qquad (2)$$

To compute the compressibility, we note that

$$\frac{dp}{dv} = \frac{dp}{d\delta}\frac{d\delta}{dv} = \frac{4K}{9\delta^7} - \frac{C(a^2 + 2a/\delta)e^{-a\delta}}{9\delta^4},$$

from which we get the compressibility by the definition
$\kappa = -\frac{1}{v}\frac{\partial v}{\partial p}$. Now we are interested in the properties of the solid at
zero pressure. Setting the expression above for p equal to zero,
a particular value of $\delta$ is determined, giving the volume at zero
pressure. In turn, we substitute this into the compressibility,